cleesmith closed this issue 12 months ago.
Hm, my guess here is that it's an issue with the prompt format. It looks like this model is trained on a very unusual prompt format meant more for programmatic use than for chat: https://huggingface.co/kaist-ai/prometheus-13b-v1.0#prompt-format
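For a rough sense of why chat-style use trips it up, here's a sketch comparing a generic chat-style prompt with the kind of graded-evaluation template Prometheus expects (field names paraphrased from the linked model card; placeholders are illustrative, not exact):

```python
# Rough illustration only -- see the linked model card for the exact template.

# A typical chat/instruct model expects something simple like:
chat_prompt = "### Instruction:\nSummarize this article.\n\n### Response:\n"

# Prometheus is an evaluator model, so its template (paraphrased) asks for a task
# description, the response to grade, a reference answer, and a score rubric:
prometheus_prompt = (
    "###Task Description:\n"
    "An instruction, a response to evaluate, a reference answer, and a score rubric are given.\n\n"
    "###The instruction to evaluate:\n{instruction}\n\n"
    "###Response to evaluate:\n{response}\n\n"
    "###Reference Answer (Score 5):\n{reference_answer}\n\n"
    "###Score Rubrics:\n{rubric}\n\n"
    "###Feedback:"
)
```

A chat app that wraps your messages in a normal chat template won't produce anything like that structure, so the model's output tends to go off the rails.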
I'd recommend trying MythoMist, OpenHermes 2.5, or any of the top-ranked models here: https://www.otherbrain.world/?columnFilters=%5B%7B%22id%22%3A%22numParameters%22%2C%22value%22%3A%5B1%2C3%2C6%2C7%2C11%2C12%2C13%5D%7D%5D
There is an update in main that will make it into the App Store soon; it fixes an issue with longer chats and may fix the truncation you're occasionally hitting with the default model.
Thanks, I will keep an eye out and try one of the other models.
Sounds good. I appreciate all the candid feedback.
Thanks again. I'm switching to openhermes-2.5-neural-chat-7b-v3-1-7b.Q5_K_M.gguf on my old iMac (3.3 GHz 6-core Intel i5, 16 GB), and now it does work, slowly, but it's time to upgrade, perhaps to an M3 with more memory (if that helps). So far your app works well, and I look forward to the updates.
Beautiful. Yes, Apple silicon is incredible for this stuff, and I definitely recommend as much RAM as you can reasonably afford. With multimodal models and breakthroughs at the 70B level, I'm wishing I had more too :)
I've attached a screen capture of responses being truncated.
Also, an image of my Settings, just in case. I am trying prometheus-13b-v1.0.Q5_K_M.gguf, which seems similar to GPT-4 (sort of).
Changing the System Prompt did not seem to have any effect.
The truncation did happen with the default model too, though less often.
Thoughts?