dssjon / biblos

http://www.biblos.app

multiple languages? #33

Closed: j2l closed 5 months ago

j2l commented 5 months ago

Hello, I'm looking for a way to use this with the Bible in other languages. I tried several RAG setups locally and failed to get useful results. For example, if the query is "What did God create in the beginning?" (asked in French against a French Bible in CSV format, VPL), it replies that the document is Deuteronomy 5, or gives another incoherent answer, even though "embedding_fulltext_search" (a SQLite table) is correctly chunked per verse. I'm not sure whether the embedding model must be multilingual or whether something else is wrong. So before trying this one, could you please let me know whether you have successfully tried it with another language? Do you think it needs some adaptation? God bless you!

dssjon commented 5 months ago

I recommend visiting https://huggingface.co/spaces/mteb/leaderboard and checking out the "French" tab under the overall score table to review the latest state-of-the-art models. Based on the current rankings, a top French model is https://huggingface.co/dangvantuan/sentence-camembert-large. God bless you also! :-)
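To make the suggestion concrete, here is a minimal sketch of per-verse retrieval in which the embedding function is swappable. It is not the code biblos uses. The names `rank_verses` and `toy_embed` are hypothetical, the three verses are only sample data, and `toy_embed` is a bag-of-words stand-in so the example runs offline. With the `sentence-transformers` package installed, the commented line shows how the recommended French model could be plugged in instead.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank_verses(
    query: str,
    verses: List[Tuple[str, str]],
    embed: Callable[[List[str]], Sequence[Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (reference, score) pairs most similar to the query."""
    # Embed verses and query in one call so they share one vector space.
    vectors = embed([text for _, text in verses] + [query])
    query_vec = vectors[-1]
    scored = [(ref, cosine(query_vec, vec)) for (ref, _), vec in zip(verses, vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def toy_embed(texts: List[str]) -> List[List[int]]:
    """Hypothetical offline stand-in: word counts over the shared vocabulary.

    It only matches exact words, so it is no substitute for a real
    French or multilingual embedding model.
    """
    tokenised = [re.findall(r"\w+", t.lower()) for t in texts]
    vocab = sorted({w for toks in tokenised for w in toks})
    return [[toks.count(w) for w in vocab] for toks in tokenised]


# Assumption: with sentence-transformers installed, a real model can replace
# toy_embed, for example:
#   from sentence_transformers import SentenceTransformer
#   embed = SentenceTransformer("dangvantuan/sentence-camembert-large").encode

# Sample data for illustration only (the Deuteronomy verse is abbreviated).
verses = [
    ("Deutéronome 5:1", "Moïse convoqua tout Israël."),
    ("Genèse 1:1", "Au commencement, Dieu créa les cieux et la terre."),
    ("Jean 1:1", "Au commencement était la Parole, et la Parole était avec Dieu, et la Parole était Dieu."),
]

results = rank_verses("Qu'est-ce que Dieu a créé au commencement ?", verses, toy_embed)
for ref, score in results:
    print(f"{ref}: {score:.3f}")
```

The key point is that the verses and the query must be embedded by the same model, and that model must handle the language of the text. If the stored vectors came from an English-only model, re-embedding the verse table with the new model is needed before the rankings become meaningful.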

j2l commented 5 months ago

Thank you very much, @dssjon! Do you mean we can replace the embedding model with any LLM? I thought it had to be an embedding model in order to produce vectors. PS: My bad, I hadn't read the title "Massive Text Embedding "