palladius / gemini-news-crawler

A GenAI news crawler in Ruby leveraging Gemini multimodality ability
MIT License
13 stars, 1 fork

embedding retrieval broken: ["the server responded with status 404",null] #4

Open palladius opened 1 month ago

palladius commented 1 month ago

It looks like the Embeddings demo is broken because embeddings retrieval now fails (possibly text-embeddings-001 is no longer usable).

What's worse, even if I fix it to work with newer embeddings (e.g. the @004 one), I still need to regenerate embeddings for 10k documents, which might take some time.
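A backfill of this size can be run in batches so that a failed request costs only one batch and can be retried. The following is a minimal plain-Ruby sketch, not the repository's code: `regenerate_embeddings`, the hash keys, and the `embedder` callable (a stand-in for whatever embeddings API call is used) are all hypothetical names chosen for illustration.

```ruby
# Hypothetical backfill sketch: re-embed documents in batches with simple retries.
# `embedder` is an assumed callable that takes an array of texts and returns
# one vector per text; it stands in for the real embeddings API call.
def regenerate_embeddings(docs, embedder:, batch_size: 100, max_retries: 3)
  failed_ids = []
  docs.each_slice(batch_size) do |batch|
    attempts = 0
    begin
      attempts += 1
      vectors = embedder.call(batch.map { |d| d[:title] })
      # Store each returned vector on its document.
      batch.zip(vectors) { |doc, vec| doc[:title_embedding] = vec }
    rescue StandardError
      retry if attempts < max_retries
      # Give up on this batch but keep processing the rest.
      failed_ids.concat(batch.map { |d| d[:id] })
    end
  end
  failed_ids
end

# Usage with a fake embedder (no network access needed).
docs = (1..5).map { |i| { id: i, title: "Article #{i}" } }
fake_embedder = ->(texts) { texts.map { |t| [t.length.to_f] } }
failed = regenerate_embeddings(docs, embedder: fake_embedder, batch_size: 2)
```

The returned list of failed ids makes it possible to re-run the backfill for just those documents later instead of starting over.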

palladius commented 1 month ago

Good news: I've changed the code to use the Gemini code (v2) to calculate embeddings. However, this creates a split brain between old and new embeddings. Where's the good news? I now also populate two fields, so I can actually query them easily:

irb(main):011> Article.where.not(title_embedding_description: nil).count
  Article Count (42.9ms)  SELECT COUNT(*) FROM "articles" WHERE "articles"."title_embedding_description" IS NOT NULL
=> 2

I've also added a nice visualization: Article.llm_info.
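One way to keep such a split brain manageable is to tally articles by the label stored next to each embedding and to compare only vectors that share a label, since vectors from different models are not comparable. The sketch below is plain Ruby over an in-memory list. `ArticleStub` is a stand-in for the real model, the label "model-v2" is made up, and it assumes that title_embedding_description holds a model label, which this thread does not confirm.

```ruby
# Stand-in for the ActiveRecord model (assumption: the description field
# stores a label identifying the model that produced the embedding).
ArticleStub = Struct.new(:title, :title_embedding, :title_embedding_description)

# Count articles per embedding label; nil means "not yet re-embedded".
def embedding_census(articles)
  articles.map { |a| a.title_embedding_description || "(unlabelled)" }.tally
end

def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |y| y * y })
  norm.zero? ? 0.0 : dot / norm
end

# Search only among articles embedded with the same model as the query.
def nearest(query_vec, query_label, articles)
  articles
    .select { |a| a.title_embedding_description == query_label }
    .max_by { |a| cosine_similarity(query_vec, a.title_embedding) }
end

articles = [
  ArticleStub.new("Old A", [1.0, 0.0], nil),
  ArticleStub.new("New B", [0.0, 1.0], "model-v2"),
  ArticleStub.new("New C", [1.0, 0.0], "model-v2"),
]
census = embedding_census(articles)
best = nearest([0.9, 0.1], "model-v2", articles)
```

In a Rails console, a similar census could probably be obtained with Article.group(:title_embedding_description).count, assuming the column is used as described above.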

palladius commented 1 month ago

I think I've fixed this in a recent push, but I'm keeping the issue open for a while until I have time to refresh ALL embeddings, and possibly move to the new ones.