SciSharp / LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
https://scisharp.github.io/LLamaSharp
MIT License

Wrong result when changing to another model #481

Open icemaple1251 opened 5 months ago

icemaple1251 commented 5 months ago

It works well when I use LLama2-7b-Chat, but when I changed the model to mixtral-8x7b-v0.1.Q2_K and asked the same question, the bot gave a wrong answer, and it even changed my original question.

Should I change some options or parameters somewhere when I switch to another model? Can anyone help me? Thanks.

(screenshots attached: "wrong" and "correct")

martindevans commented 5 months ago

Q2 is a pretty small quantisation. Have you tested your Q2 model in llama.cpp directly, to check that this isn't just a bad response caused by the quantisation?
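One way to rule LLamaSharp out is to point llama.cpp's own example CLI at the same GGUF file with the same prompt. A rough sketch of the command (the model path and prompt are placeholders; at the time of this issue the example binary was called `main`, later renamed `llama-cli`, so the name and flags may differ in your checkout):

```shell
# Run the same prompt against the same GGUF file directly in llama.cpp,
# bypassing LLamaSharp entirely.
./main \
  -m ./models/mixtral-8x7b-v0.1.Q2_K.gguf \  # placeholder model path
  -p "Your original question here" \          # the exact prompt you used
  -n 128                                      # cap generation at 128 tokens
```

If the raw llama.cpp output is equally bad, the problem is the quantised model itself rather than anything in the LLamaSharp bindings.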

icemaple1251 commented 4 months ago

I have not tested the Q2 model in llama.cpp directly, but I did try other models like "mixtral-8x7b-v0.1.Q8_0.gguf" and I still get wrong answers; some answers are even repeated several times. Are some models intended specifically for chat while others are not?

martindevans commented 4 months ago

The mixtral model you mentioned is Q8, which is much more forgiving than Q2. The smaller the number, the more the model has been compressed, and the more likely it is to give bad answers.
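The "smaller number means more compression" point can be made concrete with back-of-envelope arithmetic on the storage cost per weight. The block layouts below follow llama.cpp's ggml quantisation structs as I understand them; treat the exact byte counts (especially for Q2_K) as an illustration rather than a spec:

```python
# Rough bits-per-weight arithmetic for common llama.cpp quantisation formats.

def bits_per_weight(block_bytes: int, weights_per_block: int) -> float:
    """Average storage cost of one weight, in bits."""
    return block_bytes * 8 / weights_per_block

# Q8_0: blocks of 32 weights stored as int8 (32 bytes) + one fp16 scale (2 bytes)
q8_0 = bits_per_weight(32 + 2, 32)        # 8.5 bits/weight

# Q4_0: 32 weights packed into 4-bit nibbles (16 bytes) + one fp16 scale
q4_0 = bits_per_weight(16 + 2, 32)        # 4.5 bits/weight

# Q2_K: 256-weight super-block: 64 bytes of 2-bit quants, 16 bytes of
# sub-block scales, plus fp16 d and dmin (4 bytes) -- approximate layout
q2_k = bits_per_weight(64 + 16 + 4, 256)  # ~2.6 bits/weight

print(f"Q8_0: {q8_0} bpw, Q4_0: {q4_0} bpw, Q2_K: ~{q2_k} bpw")
```

So a Q2_K file spends roughly 2.6 bits per weight against Q8_0's 8.5, less than a third of the information budget, which is why very low-bit quants are far more likely to degrade answers.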