Open elsatch opened 7 months ago
Checking the output of backend-1:
```
backend-1  | 1. Format your answer (after AI:) in markdown.
backend-1  | 2. You have to use your tools to answer questions.
backend-1  | 3. You have to provide the sources / links you've used to answer the question.
backend-1  | 4. You may use tools more than once.
```
Maybe this could be created as an additional instruction, or as a modification to instruction number 1.
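As a sketch of the suggestion, the language rule could be appended as a fifth numbered instruction in the wrapper prompt. The function name and exact wording below are assumptions for illustration, not the project's actual code:

```python
def build_wrapper_prompt(user_query: str) -> str:
    """Build the wrapper prompt with a proposed extra language instruction."""
    instructions = [
        "Format your answer (after AI:) in markdown.",
        "You have to use your tools to answer questions.",
        "You have to provide the sources / links you've used to answer the question.",
        "You may use tools more than once.",
        # Proposed addition:
        "Answer in the same language as the question. "
        "If you cannot determine the language, default to English.",
    ]
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(instructions, 1))
    return f"{numbered}\n\nQuestion: {user_query}\nAI:"
```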
Thanks for the praise :).
I could not solve this problem in my own testing (at least not reliably). I think the main problem (at least with small LLMs) is the training data. Even with the internal prompt written in German, the output is still English :/
Forcing it to answer in German gives way worse results than using English. I would be happy to export the "system prompt" / wrapper prompt to the webui to allow customization and better testing; would that be an acceptable solution? I just don't think there's a catch-all solution.
This might be different with bigger models, but I don't have the hardware or money to run them haha.
I'm going to test "command-r", which is a 30b+ model, but it's going to take a couple of minutes.
I'm also not happy with this wrapper solution in general. I don't like the idea of injecting things into people's prompts, but I had some reliability problems with system prompts.
This seems to work alright with a big model (command-r-q8), but I think this really depends on the training data. Command-R was trained on multiple languages:
> Command-R has the capability for multilingual generation evaluated in 10 languages and highly performant RAG capabilities
I'm open to merging a working solution, but we have to keep in mind that every bit of extra prompt complexity will degrade the performance of small LLMs.
Coincidentally, I am downloading Command-R too! Indeed, my test question was:
Question: Describe the features of the command R model released by Cohere yesterday
When I asked in Spanish "Describe in Spanish the features of the model", it replied that Command-R is an R package with 35B and that you can change the language of the R Commander interface to work in Spanish easily.
Mixtral 8x7B officially supports English, French, Spanish, German and a few other languages. But it's way too large.
Let me search for actual recommendations about how to direct LLMs to output in a particular language. I'll start with Mistral family models. (And I know the recommendations might not transfer to other models).
Maybe being able to tune the instructions from the UI could help to adapt to several scenarios.
https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
Looks like the embeddings model was only trained on English text? Maybe changing this would also improve the results
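One way to sanity-check whether an embedding model aligns languages (a sketch, not part of the project): embed a sentence and its translation and compare cosine similarity; a cross-lingual model should score translated pairs close to 1. The model names in the usage comment are just the ones discussed in this thread:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def crosslingual_score(model, pairs):
    """Average cosine similarity over (sentence, translation) pairs.

    `model` is anything with a sentence-transformers style
    `encode(list_of_texts) -> list_of_vectors` method.
    """
    scores = []
    for sentence, translation in pairs:
        va, vb = model.encode([sentence, translation])
        scores.append(cosine(va, vb))
    return sum(scores) / len(scores)

# Usage (requires `pip install sentence-transformers` and a model download):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("intfloat/multilingual-e5-small")
# print(crosslingual_score(model, [("The weather is nice", "El clima es agradable")]))
```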
Mistral results using "nomic-embed-text" instead of "minilm". The results are good btw, haha; I forgot that not everyone can judge that.
Damn, I'm going to switch the whole project to nomic. It's a bit larger but performs way better haha.
I could not find info about the Nomic model being multilingual in their technical info but... if it works, it works. The fact that it works with Ollama is also quite nice.
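For reference, a minimal sketch of how a nomic-embed-text request to a local Ollama server could be built; the `/api/embeddings` endpoint and `model`/`prompt` fields follow Ollama's REST API, but the helper name is made up:

```python
import json

def build_embed_request(text: str, model: str = "nomic-embed-text",
                        host: str = "http://localhost:11434"):
    """Build the URL and JSON body for Ollama's embeddings endpoint."""
    body = json.dumps({"model": model, "prompt": text})
    return f"{host}/api/embeddings", body

# Usage (requires a running Ollama server with the model pulled):
# import urllib.request
# url, body = build_embed_request("hola mundo")
# req = urllib.request.Request(url, data=body.encode(),
#                              headers={"Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     embedding = json.load(resp)["embedding"]
```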
After some reading it seems like the most performant small multilingual embedding model is:
https://huggingface.co/intfloat/multilingual-e5-small
But Nomic also seems to be a very good choice!
Source (behind the paywall of doom of Medium): https://medium.com/p/40325c308ebb
Looks good to me, right? Using the default model (hermes 2 pro mistral 7b).
Looks much better indeed!
I will look into testing different embedding models, but I have to do some real work now, sorry haha
Thanks for the links tho :) I hope the hotfix (just pushed containers) works okay.
Sure! Thanks for your prompt response!
Back to real work here too (on embeddings, precisely!). I will test the containers next week and report back improvements.
Have a nice weekend!
Just exposed the system prompt and a lot of other settings to the frontend; this might improve testing.
Thanks for the update! I might need to create an updated video soon!
Original video review (in Spanish): https://youtu.be/eoWz7hLb-gA
First of all, thanks @nilsherzig for getting this wonderful project together!
When searching in non-English languages, LLocalSearch responds in English by default.
Prompt engineering in the search string doesn't work properly. Things like:
This happens because the search query is passed to the LLM as the string to work upon, not as instructions for producing the output.
To fix this behavior, the language "selection" should live outside the search query. I suspect you have a base template for the system role. So, the way to fix this would be to add a sentence like: "Create your reply in the same language as the search string. When you are not able to determine the language of the search string, default to English."
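The fallback logic suggested above could be sketched like this. The `detect_language` hook is hypothetical; in practice the instruction could simply live in the system-role template and let the LLM decide:

```python
def language_instruction(query: str, detect_language=None) -> str:
    """Return a language directive for the system-role template.

    `detect_language` is an optional callable (e.g. a wrapper around a
    language-detection library) returning a language name or None.
    """
    lang = None
    if detect_language is not None:
        try:
            lang = detect_language(query)
        except Exception:
            lang = None  # detection failed: fall back to the generic rule
    if lang:
        return f"Reply in {lang}."
    return ("Create your reply in the same language as the search string. "
            "When you are not able to determine the language, default to English.")
```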
Some additional variations could be crafted and tested to make sure that output is produced in the same language as input.
I am happy to support testing these changes.