deepset-ai / haystack

:mag: AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
https://haystack.deepset.ai
Apache License 2.0
16.82k stars 1.85k forks

Does it work with Spanish? #940

Closed HighDeFing closed 3 years ago

HighDeFing commented 3 years ago

Question: I was wondering whether it works with Spanish document databases.

Additional context: Also, is it possible to use MongoDB?

tholor commented 3 years ago

Hey @HighDeFing, I suppose you mean a Spanish question answering / reader model? There are a couple of those available on HF's model hub. For example: https://huggingface.co/mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es

However, one word of caution: I think most of these models are trained on automatically translated versions of SQuAD. I don't know about Spanish, but for German we didn't get great performance from these types of models. An alternative worth comparing might be a multilingual model (e.g. XLM-RoBERTa).
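A rough sketch of how such a comparison could look, run directly against the Hugging Face `transformers` question-answering pipeline rather than through Haystack. The Spanish model name is the one linked above. The multilingual checkpoint `deepset/xlm-roberta-base-squad2` is an assumption on my side (swap in whichever XLM-RoBERTa QA model you want to test), and the helper names `load_reader` and `compare_models` are made up for illustration.

```python
# Spanish reader model linked above.
SPANISH_QA_MODEL = "mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es"
# Assumption: a multilingual XLM-RoBERTa QA checkpoint to compare against.
MULTILINGUAL_QA_MODEL = "deepset/xlm-roberta-base-squad2"


def load_reader(model_name):
    """Load an extractive QA pipeline from the Hugging Face model hub."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import pipeline

    return pipeline("question-answering", model=model_name, tokenizer=model_name)


def compare_models(question, context, model_names, loader=load_reader):
    """Ask every model the same question and collect (answer, score) per model."""
    results = {}
    for name in model_names:
        reader = loader(name)
        prediction = reader(question=question, context=context)
        results[name] = (prediction["answer"], prediction["score"])
    return results


if __name__ == "__main__":
    # Downloads both models on first run.
    answers = compare_models(
        question="¿Cuál es la capital de España?",
        context="Madrid es la capital de España y su ciudad más poblada.",
        model_names=[SPANISH_QA_MODEL, MULTILINGUAL_QA_MODEL],
    )
    for name, (answer, score) in answers.items():
        print(f"{name}: {answer!r} (score={score:.3f})")
```

A single question won't tell you much. Running this over a handful of questions from your own Spanish documents should give a better feel for which model is worth plugging into a reader.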

Regarding MongoDB: no, it's not supported as of now.

Hope this helps!

HighDeFing commented 3 years ago

Oh, thank you. Yes, I was wondering whether BETO (Spanish BERT) would work with Haystack. I'll give this a try.