google-ai-edge / mediapipe-samples


Phi-2 Inference has some issues. #380

Open areebbashir opened 6 months ago

areebbashir commented 6 months ago

When I'm running phi-2 on device, there is an issue while generating responses. When I feed it a question, it starts generating a response (although quite slowly), but it just doesn't stop. In ChatViewModel, inside the sendMessage API, inferenceModel.partialResults always reports done as false, so the model keeps asking itself follow-up questions and answering them on its own. (Screenshot attached: 20240501_164239)

areebbashir commented 6 months ago

It turns out this issue occurs with every other model I use; only Gemma stops correctly.

vittalitty commented 6 months ago

@areebbashir Hello, I have also recently been trying to run the phi-2 model on an Android device. However, I encountered an error while converting the model to a compatible MediaPipe format; it seems the model_ckpt_util bindings cannot be found. My Python version is 3.9, and I tried mediapipe 0.10.11 and 0.10.13; neither can run the following script properly. Can you point out the problem? Thank you very much!

```python
import mediapipe as mp
from mediapipe.tasks.python.genai import converter

config = converter.ConversionConfig(
    input_ckpt="E:/PythonProject/models/phi-2/model-00002-of-00002.safetensors",
    ckpt_format="safetensors",
    model_type="PHI_2",
    backend="gpu",
    output_dir="E:/PythonProject/models/phi-2_output/output",
    combine_file_only=False,
    vocab_model_file="E:/PythonProject/models/phi-2",
    output_tflite_file="E:/PythonProject/models/phi-2_output/phi_2_model_gpu.bin",
)
converter.convert_checkpoint(config)
```

Running error:

```
Traceback (most recent call last):
  File "E:\PythonProject\pythonProject2\convert_to_api.py", line 17, in <module>
    converter.convert_checkpoint(config)
  File "E:\PythonProject\pythonProject2.venv\lib\site-packages\mediapipe\tasks\python\genai\converter\llm_converter.py", line 251, in convert_checkpoint
    vocab_model_path = convert_bpe_vocab(
  File "E:\PythonProject\pythonProject2.venv\lib\site-packages\mediapipe\tasks\python\genai\converter\llm_converter.py", line 193, in convert_bpe_vocab
    model_ckpt_util.ConvertHfTokenizer(vocab_model_file, output_vocab_file)
AttributeError: module 'mediapipe.python._framework_bindings.model_ckpt_util' has no attribute 'ConvertHfTokenizer'
```

areebbashir commented 6 months ago

I have not encountered this particular issue yet, but I did hit some other import errors when I pip installed mediapipe. Running `pip install mediapipe --user` worked for me.

Also, you are putting the wrong path in input_ckpt. It needs to be the absolute path to the folder containing the Phi checkpoints, not a single safetensors file. Try this:

```python
import mediapipe as mp
from mediapipe.tasks.python.genai import converter

config = converter.ConversionConfig(
    input_ckpt="E:/PythonProject/models/phi-2",
    ckpt_format="safetensors",
    model_type="PHI_2",
    backend="gpu",
    output_dir="E:/PythonProject/models/phi-2_output/output",
    combine_file_only=False,
    vocab_model_file="E:/PythonProject/models/phi-2",
    output_tflite_file="E:/PythonProject/models/phi-2_output/phi_2_model_gpu.bin",
)
converter.convert_checkpoint(config)
```
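If it helps, here is a quick sanity check (just a suggestion, not from the converter docs) to confirm which mediapipe wheel is active and that the genai converter imports cleanly before re-running the conversion:

```python
# Sanity check (suggestion only): print the active mediapipe version and
# confirm the genai converter module exposes ConversionConfig.
import mediapipe as mp
from mediapipe.tasks.python.genai import converter

print("mediapipe version:", mp.__version__)
print("converter available:", hasattr(converter, "ConversionConfig"))
```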

PaulTR commented 6 months ago

I have a feeling the 'stop' token is potentially different for the non-Gemma models, and would need to be updated in the app, but I'll need to verify that. I'll add this to my TODO list for after IO!

PaulTR commented 6 months ago

> @areebbashir Hello, I have also recently been trying to run the phi-2 model on an Android device. However, I encountered an error while converting the model to a compatible MediaPipe format... [quoting vittalitty's full comment and traceback above]

Hey @vittalitty, if you're still running into this after the feedback in the previous comment, can you open a new issue for tracking? Thanks!

areebbashir commented 6 months ago

> I have a feeling the 'stop' token is potentially different for the non-Gemma models, and would need to be updated in the app, but I'll need to verify that. I'll add this to my TODO list for after IO!

Thanks. In the meantime, can you suggest anything I could try from my end?

gagangayari commented 5 months ago

Is there any API to stop generation based on some criteria?

talumbau commented 2 weeks ago

Thanks for trying out the API. We want the API to work with essentially any open model, so let's see if this is a config issue.

Can you show the Python code you are using to make the task bundle, as shown here?

For Phi-2 it looks like token 50256 is both the beginning of sequence and end of sequence token. According to the config here:

https://huggingface.co/microsoft/phi-2/blob/main/tokenizer_config.json#L5

That would be written out as <|endoftext|>. So when you are creating the BundleConfig object, I think you would have:

start_token="<|endoftext|>"

and

stop_tokens=["<|endoftext|>"],

(since stop_tokens requires a list). If your BundleConfig creation doesn't include those lines, please give that a try. Thanks!
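For reference, here is a minimal sketch of what that bundling step could look like, assuming the mediapipe.tasks.python.genai.bundler API; the file paths are placeholders, and enable_bytes_to_unicode_mapping is my assumption for a GPT-2-style BPE vocabulary like Phi-2's:

```python
# Hedged sketch: placeholder paths, assumes the genai bundler API.
from mediapipe.tasks.python.genai import bundler

config = bundler.BundleConfig(
    tflite_model="E:/PythonProject/models/phi-2_output/phi_2_model_gpu.bin",
    tokenizer_model="E:/PythonProject/models/phi-2/tokenizer.json",  # placeholder path
    start_token="<|endoftext|>",    # Phi-2 uses token 50256 as its BOS token...
    stop_tokens=["<|endoftext|>"],  # ...and the same token as EOS, so both use this string
    output_filename="E:/PythonProject/models/phi-2_output/phi2_gpu.task",
    enable_bytes_to_unicode_mapping=True,  # assumption: needed for GPT-2-style BPE tokenizers
)
bundler.create_bundle(config)
```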