ocdevel / gnothi

Gnothi is an open-source AI journal and toolkit for self-discovery. If you're interested in getting involved, we'd love to hear from you.
https://gnothiai.com
GNU Affero General Public License v3.0
174 stars, 19 forks

Books: CosineEstimator - more model optimizations via NAS #104

Closed lefnire closed 1 year ago

lefnire commented 4 years ago

I've spent a fair bit of time hyper-optimizing the architecture of CosineEstimator. One more change I want to try is alternating ReLU/Tanh activations (via Medium). Better yet, scrap all that hyper-opt code and switch to Neural Architecture Search (NAS) (towardsdatascience). I'm not sure what libraries are out there. Is this the same thing as AutoML?

Also, try mixing TF-IDF scores (user keywords <-> book keywords) with BERT scores. Maybe 50/50.
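A minimal sketch of what that 50/50 mix could look like, assuming both scorers already produce one score per book in the same order. `blend_scores` and its `tfidf_weight` parameter are hypothetical names, and the min-max rescaling is my own assumption (TF-IDF and BERT cosine scores usually live on different scales), not something Gnothi does.

```python
def _min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def blend_scores(tfidf_scores, bert_scores, tfidf_weight=0.5):
    """Weighted mix of two per-book score lists (hypothetical helper).

    tfidf_weight=0.5 gives the 50/50 mix; the remainder goes to BERT.
    """
    if len(tfidf_scores) != len(bert_scores):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= tfidf_weight <= 1.0:
        raise ValueError("tfidf_weight must be between 0 and 1")
    tfidf_norm = _min_max(tfidf_scores)
    bert_norm = _min_max(bert_scores)
    bert_weight = 1.0 - tfidf_weight
    return [
        tfidf_weight * t + bert_weight * b
        for t, b in zip(tfidf_norm, bert_norm)
    ]
```

The weight could then be treated as one more hyperparameter rather than fixed at 0.5.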

lefnire commented 1 year ago

CosineEstimator is removed. User votes now work like this:

  1. Save the book_id and direction (like=+1, dislike=-1, already_read=1, remove=0, etc)
  2. The books route is real-time now, based on filters. So the user.embedding (saved in S3) is retrieved for use in cosine similarity
  3. Before doing semantic_search with sentence_transformers (cosine similarity), take every book in the books.feather file that the user has voted on (by book_id) and move the user vector a smidge in that direction. E.g., embedding * learning_rate * direction.
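A rough sketch of step 3 in plain Python, to show the vector math only. The real code works on sentence_transformers embeddings loaded from books.feather and S3; here the embeddings are plain lists, and `nudge_user_vector`, `cosine_similarity`, and the `learning_rate=0.1` default are hypothetical choices for illustration.

```python
import math


def nudge_user_vector(user_vec, voted_books, learning_rate=0.1):
    """Return a copy of user_vec moved toward liked books, away from disliked.

    voted_books: iterable of (book_embedding, direction) pairs, where
    direction is the saved vote value (e.g. +1, -1, 0).
    """
    vec = list(user_vec)
    for book_embedding, direction in voted_books:
        if len(book_embedding) != len(vec):
            raise ValueError("embedding dimensions must match")
        for i, value in enumerate(book_embedding):
            # embedding * learning_rate * direction, added to the user vector
            vec[i] += value * learning_rate * direction
    return vec


def cosine_similarity(a, b):
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For example, with `user = [1.0, 0.0]` and a liked book `[0.0, 1.0]`, the nudged vector scores higher against that book than the original did, and a vote with direction 0 leaves the vector unchanged.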

So no more ML here; just simple vector math. It actually works waaaay better anyway (subjectively, based on my voting), is less error-prone, is less computationally expensive, and allows for real-time.