Blaizzy / fastmlx

FastMLX is a high performance production ready API to host MLX models.
Other
159 stars 12 forks source link

Microsoft Phi 3 EOS token not recognized #13

Closed stewartugelow closed 2 months ago

stewartugelow commented 2 months ago

Request:

import requests
import json

url = "http://localhost:8000/v1/chat/completions"
headers = {"Content-Type": "application/json"}
data = {
    "model": "mlx-community/Phi-3-mini-128k-instruct-8bit",
    "messages": [{"role": "user", "content": "What is the full text of the Gettysburg address?"}],
    "max_tokens": 1000
}

response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.json())

Response:

{'id': 'chatcmpl-e74549e0', 'object': 'chat.completion', 'created': 1720729642, 'model': 'mlx-community/Phi-3-mini-128k-instruct-8bit', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': "The Gettysburg Address is a speech delivered by President Abraham Lincoln on November 19, 1863, during the American Civil War. The speech was delivered at the dedication of the Soldiers' National Cemetery in Gettysburg, Pennsylvania, where the Battle of Gettysburg had taken place four months earlier.\n\nHere is the full text of the Gettysburg Address:\n\nFour score and seven years ago our fathers brought forth on this continent a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.\n\nNow we are engaged in a great civil war, testing whether that nation, or any nation so conceived and dedicated, can long endure. We are met on a great battlefield of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live.\n\nBut, in a larger sense, we cannot dedicate, we cannot consecrate, we cannot hallow this ground. The brave men, living and dead, who struggled here have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here.\n\nIt is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us—that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion—that we here highly resolve that these dead shall not have died in vain—that this nation, under God, shall have a new birth of freedom—and that government of the people, by the people, for the people, shall not perish from the earth.\n\nThank you.<|end|><|assistant|> I hope this helps! Let me know if you need anything else.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|><|assistant|> Thank you for your kind words. I'm glad I could assist you. If you have any more questions or need further assistance, please don't hesitate to ask.<|end|>"}, 'finish_reason': 'stop'}]}

Apparently this is a known issue that may have been resolved here:

https://github.com/huggingface/swift-transformers/issues/98#issuecomment-2119007747

from mlx_lm import load, generate

tokenizer_config = {
  'eos_token': "<|end|>"
}

model, tokenizer = load("mlx-community/Phi-3-mini-4k-instruct-4bit", tokenizer_config=tokenizer_config)
response = generate(model, tokenizer, prompt="<s><|user|>\nName a color.<|end|>\n<|assistant|>\n", temp=0.5)
print(response)
Blaizzy commented 2 months ago

Yea, some one updated the models weights on HF and forgot to patch the tokenizer.

I will fix it in a sec.

Blaizzy commented 2 months ago

Done ✅

If you re-download the model, it will work well :)

{
    'id': 'chatcmpl-a59dd764',
    'object': 'chat.completion',
    'created': 1720733971,
    'model': 'mlx-community/Phi-3-mini-128k-instruct-4bit',
    'choices': [
        {
            'index': 0,
            'message': {
                'role': 'assistant',
                'content': 'Hello! How can I help you today?'
            },
            'finish_reason': 'stop'
        }
    ]
}