openvinotoolkit / openvino.genai

Run Generative AI models with simple C++/Python API and using OpenVINO Runtime
Apache License 2.0

Chat streaming is not working with Phi 2 #617

Closed rupeshs closed 3 months ago

rupeshs commented 4 months ago

Used sample code: https://github.com/openvinotoolkit/openvino.genai/blob/master/samples/python/chat_sample/chat_sample.py

https://github.com/user-attachments/assets/59988c21-7cbe-4b29-998f-167d67c5fd02

YuChern-Intel commented 3 months ago

To use the Phi-2 model, run it with the beam_search_causal_lm, greedy_causal_lm, or multinomial_causal_lm samples instead of the chat sample.
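Since Phi-2 ships without a chat template (as explained later in this thread), another workaround is to skip the chat pipeline entirely and assemble the prompt yourself before calling plain generation. A minimal sketch of such a prompt builder, using Phi-2's documented "Instruct:/Output:" format; the build_phi2_prompt helper is invented for illustration and is not part of openvino.genai:

```python
# Hypothetical helper: flatten a chat-style message list into Phi-2's
# "Instruct: ...\nOutput:" format so a plain (non-chat) generate call
# can be used instead of start_chat().
def build_phi2_prompt(messages):
    parts = []
    for message in messages:
        if message["role"] == "user":
            parts.append("Instruct: " + message["content"] + "\nOutput:")
        elif message["role"] == "assistant":
            parts.append(message["content"] + "\n")
    return "".join(parts)

prompt = build_phi2_prompt([
    {"role": "user", "content": "Why is the Sun yellow?"},
])
print(prompt)
```

The resulting string can be passed to a non-chat sample or to LLMPipeline.generate directly.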

rupeshs commented 3 months ago

@YuChern-Intel This is not mentioned in the documentation; see https://github.com/openvinotoolkit/openvino.genai/blob/master/src/README.md.

(screenshot attached)

YuChern-Intel commented 3 months ago

I have validated running the microsoft/phi-2 model with these demos.

olpipi commented 3 months ago

Hi @rupeshs The root cause is that the phi-2 model doesn't have a chat template. It is probably not tuned for chat, which is why it lacks one. The GenAI pipeline crashes when you start a chat scenario without a chat template. I will fix it to return a proper error instead.

If you want to run chat with the phi-2 model, you should add a chat_template field to tokenizer_config.json (see phi-3 for an example). I tried this template: "chat_template": "{% for message in messages %}{% if (message['role'] == 'user') %}{{'Instruct: ' + message['content'] + '\nOutput:'}}{% elif (message['role'] == 'assistant') %}{{message['content'] + '\n'}}{% endif %}{% endfor %}" It is not official, so I cannot guarantee it will work properly.
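The suggested edit can be scripted. A minimal sketch, assuming tokenizer_config.json sits in the exported model directory; the patch_chat_template helper and the throwaway demo file are illustrative, and the template string is the unofficial one quoted above:

```python
import json
import os
import tempfile

# The unofficial Phi-2 chat template suggested above ("\n" is a real
# newline in the template, which json.dump re-encodes as \n on disk).
PHI2_CHAT_TEMPLATE = (
    "{% for message in messages %}"
    "{% if (message['role'] == 'user') %}"
    "{{'Instruct: ' + message['content'] + '\nOutput:'}}"
    "{% elif (message['role'] == 'assistant') %}"
    "{{message['content'] + '\n'}}"
    "{% endif %}{% endfor %}"
)

def patch_chat_template(config_path):
    """Add a chat_template field to tokenizer_config.json if missing."""
    with open(config_path) as f:
        config = json.load(f)
    config.setdefault("chat_template", PHI2_CHAT_TEMPLATE)
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return config

# Demo on a throwaway config file; a real run would point at the
# exported model's tokenizer_config.json instead.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "tokenizer_config.json")
    with open(path, "w") as f:
        json.dump({"model_max_length": 2048}, f)
    patched = patch_chat_template(path)

print("chat_template" in patched)
```

setdefault keeps any template the model already has, so the script is safe to re-run.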

olpipi commented 3 months ago

https://github.com/openvinotoolkit/openvino.genai/pull/697 fixes the segfault.
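For reference, the behavior the fix aims for, raising a clear error instead of segfaulting when a chat scenario starts on a template-less model, can be sketched in pure Python. This is only an illustration of the intent, not the actual C++ change in the pull request; the require_chat_template name is invented:

```python
# Illustration only: the pipeline should fail loudly, not crash, when a
# chat scenario starts without a chat_template in tokenizer_config.json.
def require_chat_template(tokenizer_config):
    template = tokenizer_config.get("chat_template")
    if not template:
        raise ValueError(
            "Chat scenario requires a chat_template in tokenizer_config.json; "
            "phi-2 does not ship one, so add it manually or use a non-chat sample."
        )
    return template

try:
    require_chat_template({"model_max_length": 2048})  # phi-2-like config
except ValueError as err:
    print("error:", err)
```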