bug: Phi 3 Mini Model Generates Random Responses

janhq / jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)

https://jan.ai/

GNU Affero General Public License v3.0


bug: Phi 3 Mini Model Generates Random Responses #3023

Closed richardstevenhack closed 4 months ago

richardstevenhack commented 4 months ago

Describe the bug

I am using the Phi 3 mini model, which I migrated from LMStudio to Jan. This model was working fine in LMStudio. In Jan, after starting the app, selecting the model, and typing "who are you", I get its basic introduction, but then it goes off the rails and starts putting out answers to questions I haven't asked. See the screenshot. This does NOT happen in LMStudio. The Phi models are known for being overly talkative, but this is ridiculous.

Steps to reproduce

Steps to reproduce the behavior:

Start Jan.
Select model.
Type "Who are you?"
See screenshot:

Expected behavior

The simple response the model would ordinarily give.

Screenshots

See above.

Environment details

Operating System: latest openSUSE Tumbleweed Linux with KDE desktop.
Jan Version: 0.5.0
Processor: Ryzen 9 5950X
RAM: 64GB
Running CPU only - AMD Radeon RX 550 2GB GPU

Logs

app.log

Additional context

Microsoft Phi models are apparently known for excessive verbiage, but this is simply off the wall and doesn't happen with Phi 3 mini on LMStudio or AnythingLLM. Someone suggested it's ignoring prompt-format stop codes, but I don't know how to adjust that in Jan. I tried this template: <|user|> {User} <|end|><|assistant|> {Assistant} in "Model Parameters", but that had no effect, and Model Parameters does not persist across reboots. See here on Hugging Face for the verbiage issue: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/discussions/8

TikoTako commented 4 months ago

You have the wrong prompt and stop tokens; it just writes out the whole training dataset.

Edit: I just did a 30-minute test, and... it's not Jan, it's the model, which was trained with a faulty dataset. How is this even possible? 😦

TikoTako commented 4 months ago

Are you sure you are using the Microsoft GGUF release? That is microsoft/Phi-3-mini-4k-instruct-gguf, file Phi-3-mini-4k-instruct-q4.gguf.

I had that one and also this version: phi3-ita-mini-4k-instruct.Q8_0.gguf. The Microsoft one works fine, while the other one is broken; its tokens are a mess.

richardstevenhack commented 4 months ago

Well, as I said, I moved the one I was using in LMStudio to Jan. It was working fine in LMStudio. I suspect it has something to do with the prompt and stop tokens, but I don't know enough to tell which ones to use. The only prompt template I see in Jan is this one:

{system_message}
### Instruction: {prompt}
### Response:

Someone said on HuggingFace that the Microsoft model doesn't even respect a system message.

I am using Phi-3-mini-4k-instruct-gguf. This isn't the Q4 version. Perhaps you're right and the version I downloaded is something different. I also have the Phi-3-mini-4k-instruct-fp16.gguf version. I just asked that one the "Who are you?" question and got back this shorter (albeit still repetitive) answer:

`I am an AI digital assistant created to help answer your questions and assist with various tasks. How can I help you today? <|assistant|> Hello! I'm an AI digital assistant, here to answer your questions and provide assistance. How can I assist you today?

<|assistant|> Hello! I'm an AI here to help you. How can I serve you today?

<|assistant|> Hello! I'm an artificial intelligence designed to provide assistance and information. How may I help you today?<|endoftext|>`

So perhaps the other one is indeed broken. I still wonder about the effect of the system prompt. Will those model parameters work properly for any model or do they need to be adjusted per model?

TikoTako commented 4 months ago

Maybe LMStudio has an end-token list with all the possible tokens and uses it with any model, so it can stop anyway.

Jan has a model.json that is specific to each model, so if you use a model that was trained with a wrong dataset or tokenizer, it can't stop when the model generates something with a wrong token.

Each model has different prompts/tokens:

<|user|>\n{prompt}<|end|>\n<|assistant|>
"stop": ["<|end|>"]

[INST] {prompt} [/INST]
"stop": ["[/INST]"]

<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>{system_prompt}<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>{prompt}<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>
"stop": ["<|END_OF_TURN_TOKEN|>"]

<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_message}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
"stop": ["<|end_of_text|>","<|eot_id|>"]

etc...

The first two don't have a system prompt, while the other two do, and as you can see the stop tokens are different.
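To make the template/stop-token pairing concrete, here is a minimal Python sketch. The helper names `render_prompt` and `truncate_at_stop` are hypothetical and not part of Jan or llama.cpp; the template and stop strings are the Phi-3 ones quoted above. It assumes stop handling amounts to cutting the text at the first stop string, which simplifies what an inference engine actually does.

```python
from __future__ import annotations

# Phi-3 template and stop list as quoted in this thread.
PHI3_TEMPLATE = "<|user|>\n{prompt}<|end|>\n<|assistant|>"
PHI3_STOP = ["<|end|>"]


def render_prompt(template: str, prompt: str, system_message: str = "") -> str:
    """Fill the {system_message} and {prompt} placeholders by plain replacement."""
    return (
        template.replace("{system_message}", system_message)
        .replace("{prompt}", prompt)
    )


def truncate_at_stop(text: str, stop: list[str]) -> str:
    """Cut the text at the earliest occurrence of any stop string."""
    positions = [text.find(s) for s in stop if s and s in text]
    return text[: min(positions)] if positions else text


if __name__ == "__main__":
    print(render_prompt(PHI3_TEMPLATE, "Who are you?"))
    # A stop list that matches the model's end token cuts the reply short.
    print(truncate_at_stop("I am an AI assistant.<|end|>\n<|user|> next", PHI3_STOP))
    # A stop list that does not match leaves the text untouched.
    print(truncate_at_stop("Hello!<|endoftext|>", PHI3_STOP))
```

If the stop list never matches what the model emits, nothing is cut, which would be consistent with the run-on answers reported in this issue.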

In my case, phi3-ita-mini-4k-instruct.Q8_0.gguf was trained, I guess, with a few datasets picked haphazardly; it generates text with more than three different token sets instead of the Phi ones.

[Image: bad tokens]

richardstevenhack commented 4 months ago

I see that Jan has an "assistant.json" for the "default assistant" that goes like this: "retrieval_template": "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n----------------\nCONTEXT: {CONTEXT}\n----------------\nQUESTION: {QUESTION}\n----------------\nHelpful Answer:"

The one for the Phi 3 model shows: "prompt_template": "{system_message}\n### Instruction: {prompt}\n### Response:" and also provides under the "parameters" section: "stop": [ "<endofstring>".

The responses I get from the model are of the form: <|assistant|> Hello! I'm an artificial intelligence designed to provide assistance and information. How may I help you today?<|endoftext|>

So that does seem to be some sort of mismatch. I just don't know whether that's an issue with the model or the way Jan is interpreting these json settings...
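One rough way to check for this kind of mismatch is to scan a response for leftover special tokens that the configured stop list does not cover. The Python sketch below is only an illustration: the function name and the `<|...|>` token pattern are assumptions, not something Jan provides.

```python
from __future__ import annotations

import re

# Assumed shape of a special token: <|name|> with no spaces inside.
SPECIAL_TOKEN = re.compile(r"<\|[^|<>\s]+\|>")


def unhandled_special_tokens(output: str, stop: list[str]) -> list[str]:
    """Return special tokens seen in the output that are not in the stop list,
    in order of first appearance and without duplicates."""
    seen: list[str] = []
    for token in SPECIAL_TOKEN.findall(output):
        if token not in stop and token not in seen:
            seen.append(token)
    return seen


if __name__ == "__main__":
    response = (
        "<|assistant|> Hello! I'm an artificial intelligence designed to provide "
        "assistance and information. How may I help you today?<|endoftext|>"
    )
    # Stop list as quoted from the model's settings in this thread.
    print(unhandled_special_tokens(response, ["<endofstring>"]))
```

Any token this reports is one the model produced but the stop list did not cover, which makes it a candidate for the stop list or a hint that the prompt template does not fit the model.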

I'm going to close this bug report out since I can't confirm it's a Jan bug vs. a model bug. I need to gain a better understanding of these sorts of issues, which I'm sure I will over time.

TikoTako commented 4 months ago

As you can see in Microsoft's Phi-3-mini-4k-instruct-gguf, the correct prompt for Phi-3 is <|user|>\n{prompt}<|end|>\n<|assistant|> with stop=["<|end|>"]. I guess you don't have the original Microsoft GGUF if it replies with <|endoftext|>.
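A minimal Python sketch of applying that template and stop token to a model.json-style dictionary. The "prompt_template" key and the "stop" entry under "parameters" are the ones quoted earlier in this thread; placing "prompt_template" under a "settings" section is an assumption, so check the layout of your own file before relying on it. The function name is hypothetical.

```python
import copy
import json

PHI3_PROMPT_TEMPLATE = "<|user|>\n{prompt}<|end|>\n<|assistant|>"
PHI3_STOP = ["<|end|>"]


def with_phi3_settings(model_config: dict) -> dict:
    """Return a copy of the config with the Phi-3 template and stop list applied."""
    updated = copy.deepcopy(model_config)
    # Assumption: the template lives under "settings"; adjust to match your file.
    updated.setdefault("settings", {})["prompt_template"] = PHI3_PROMPT_TEMPLATE
    updated.setdefault("parameters", {})["stop"] = list(PHI3_STOP)
    return updated


if __name__ == "__main__":
    # Values as reported in this thread for the mismatched configuration.
    current = {
        "settings": {
            "prompt_template": "{system_message}\n### Instruction: {prompt}\n### Response:"
        },
        "parameters": {"stop": ["<endofstring>"]},
    }
    print(json.dumps(with_phi3_settings(current), indent=2))
```

Working on a copy keeps the original values around for comparison, which helps if the new template turns out not to fix the behaviour.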

jan\assistants\jan\assistant.json holds the "generic" settings for the assistant.

jan\models\**MODELNAME**\model.json holds the base model settings; when you create a new chat or change the model, Jan uses these.

jan\threads\jan_**SOMEID**\thread.json is the settings file for one specific thread (that is, the actual chat); it holds the settings you use in that chat.
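One way to picture the relationship between model.json and thread.json described above is as a snapshot: a new chat starts from the model's base settings and then keeps its own copy. The Python sketch below is only a mental model under that assumption, not Jan's actual code, and `new_thread_settings` is a hypothetical name.

```python
import copy


def new_thread_settings(model_settings: dict) -> dict:
    """Start a thread from an independent copy of the model's base settings."""
    return copy.deepcopy(model_settings)


if __name__ == "__main__":
    model = {"parameters": {"stop": ["<|end|>"]}}
    thread = new_thread_settings(model)
    # Changing the thread's copy leaves the model's base settings untouched.
    thread["parameters"]["stop"].append("<|endoftext|>")
    print(model["parameters"]["stop"])
    print(thread["parameters"]["stop"])
```

Under this picture, a change made inside one chat would stay in that chat's thread.json, while a change meant for every new chat would belong in model.json.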

richardstevenhack commented 4 months ago

Thanks. I think I'll try using the Microsoft prompt template and see if that helps. If not, I'll just dump that model and try something else. I just installed MSTY which is very nice and I have installed Llama3 8B to experiment with that.

I see the Phi-3 8B and Phi 3 medium have the prompts you mentioned. So that is probably the issue. I am now downloading the Phi 3 medium and will see if that is an improvement.

Thanks for your help. It's been very instructive.