marella / chatdocs

Chat with your documents offline using AI.
MIT License

Prompting in another language sometimes gives an English answer #27

Open Ananderz opened 1 year ago

Ananderz commented 1 year ago

Has anyone figured out how to get it to answer only in the language you use to ask questions? It jumps back and forth a little between English and the other language.

MyraBaba commented 1 year ago

@marella Are other languages, e.g. Spanish, Dutch, and Turkish, supported? Which model is best for those?

marella commented 1 year ago

I don't think the default models are the best for all languages. I'm also not familiar with which models are best for other languages.

I would suggest asking for model suggestions (language-specific or multilingual) in forums like r/LocalLLaMA. You would need both an embeddings model and an LLM that work well with the language.

For the embeddings model, you can also look at some of the multilingual models in https://huggingface.co/sentence-transformers#models (for example, https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2).
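
One way to compare candidate embeddings models on your own language is to embed a query and a few document snippets, then check whether the snippets rank in the order you expect. The sketch below is only an illustration: `cosine`, `rank_documents`, and `sentence_transformers_embedder` are hypothetical helper names, not part of chatdocs, and the last one assumes the `sentence-transformers` package is installed and the model can be downloaded.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Embedder = Callable[[List[str]], List[Vector]]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query: str, docs: List[str], embed: Embedder) -> List[Tuple[float, str]]:
    """Return (score, document) pairs, most similar to the query first."""
    vectors = embed([query] + list(docs))
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [(cosine(query_vec, vec), doc) for doc, vec in zip(docs, doc_vecs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def sentence_transformers_embedder(
    model_name: str = "sentence-transformers/paraphrase-multilingual-mpnet-base-v2",
) -> Embedder:
    """Hypothetical helper: wrap a sentence-transformers model as an Embedder.

    The import is done lazily so the ranking helpers above work without the package.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return lambda texts: model.encode(list(texts)).tolist()
```

For example, `rank_documents("Wie kündige ich den Vertrag?", snippets, sentence_transformers_embedder())` would score your own snippets against a German query. If the relevant snippet does not rank near the top, that model is probably a poor fit for the language.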

wypiki commented 1 year ago

During extensive testing I found that Guanaco 65B works best as the main model for German, and https://huggingface.co/intfloat/multilingual-e5-large works best for the multilingual embeddings.

Ananderz commented 1 year ago

It could potentially work to add a prompt template, if we can use a conversational retrieval QA chain, that tells it to reply only in the specific language. And yes, you need a LARGE LLM like Guanaco 65B or Falcon 40B that is trained on the specific languages you need, plus the e5 multilingual sentence transformer.
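
The prompt template idea could look roughly like the sketch below. It uses plain Python string formatting only. `QA_TEMPLATE` and `build_prompt` are hypothetical names, the wording of the instruction is a guess, and whether chatdocs lets you plug in a custom template like this is an assumption that this thread does not confirm.

```python
from typing import List

# Hypothetical template: the language instruction is repeated at the end
# because smaller models tend to drift back to English otherwise.
QA_TEMPLATE = """Use the following context to answer the question.
Answer only in {language}, even if the context is written in another language.
If the answer is not in the context, say so in {language}.

Context:
{context}

Question: {question}
Answer ({language}):"""


def build_prompt(question: str, context_chunks: List[str], language: str) -> str:
    """Fill the template with the retrieved chunks and the target language."""
    context = "\n\n".join(chunk.strip() for chunk in context_chunks)
    return QA_TEMPLATE.format(language=language, context=context, question=question)
```

Calling `build_prompt("Was steht im Vertrag?", retrieved_chunks, "German")` gives a string you can pass to the LLM in place of the default question-answering prompt. A template alone may not be enough if the model was trained mostly on English text.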