soilwise-he / natural-language-querying

Application component that provides Natural Language Querying (NLQ) services, making knowledge stored in a graph database accessible for e.g. a ChatBot UI.
MIT License

Decide which (human) languages need to be supported for NLQ #13

Closed robknapen closed 1 month ago

robknapen commented 2 months ago

LLMs vary in the amount of multilingual data they were trained on. Also, the quality of semantic embeddings is not the same for every language. Therefore it is important to decide early on which human languages need to be supported.

BerkvensNick commented 1 month ago

@robknapen We do not have this requirement yet, and I think we will only get it during the evaluation with the UC and JRC at the end of the first iteration. The EUSO dashboard website is in English (https://esdac.jrc.ec.europa.eu/esdacviewer/euso-dashboard/), which implies that JRC (ESDAC) currently 'serves' its users in English. Can we focus on English for now? It is realistic that JRC will later ask whether they can serve their users in their national languages, but we can re-evaluate our choices when that question comes up. And where we do have to make choices, can we go for the most multilingual LLM and embedding algorithm?

robknapen commented 1 month ago

Sure we can. Focusing on multilingual embedding and language models sounds like a good approach, even though most will be in English at first.

robknapen commented 1 month ago

Closed as answered.