Hank-Kuo opened this issue 2 months ago
I see this issue with rope_scaling for the model Phi-3.5-vision-instruct-q4f32_1-MLC on HF.
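For reference, a minimal way to look at the field in question is to dump the rope_scaling block from the checkpoint's config.json; the local path below is a placeholder:

import json

# Print the rope_scaling block the compiler has to parse for this checkpoint.
# The path is a placeholder for wherever the HF model was downloaded.
with open("./dist/models/Phi-3.5-vision-instruct/config.json") as f:
    config = json.load(f)

print(json.dumps(config.get("rope_scaling"), indent=2))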
Thank you folks for reporting. We'll look into it and get a fix. It looks like our recent support of phi3.5 unexpectedly breaks the phi3v compilation.
Hi @Hank-Kuo, we fixed this issue in #2862. Could you try again once that PR is merged into the main branch? Thanks.
Hi @mengshyu, it works for me, thanks for your help. As for the third bug, I still get this error:
ValueError: Check failed: (ptr->dl_tensor.ndim == ndim) is false: ErrorContext(fn=image_embed, loc=param[0], param=pixel_values, annotation=R.Tensor((1, 17, 3, 336, 336), dtype="float32")) expect Tensor with ndim 5 but get 4
I think this file, protocol/conversation_protocol.py, is causing the error.
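To make the mismatch concrete, a minimal sketch: the shapes come from the error message, and the 4-D tensor below stands in for what the default preprocessing appears to produce (my assumption about the cause):

import numpy as np

# The compiled image_embed function expects a 5-D pixel_values tensor,
# per the annotation in the error above: R.Tensor((1, 17, 3, 336, 336)).
expected_ndim = 5

# A LLaVA-style preprocessor hands back one 336x336 view per image,
# i.e. a 4-D tensor, which is why the ndim == 5 check fails.
llava_style_pixel_values = np.zeros((1, 3, 336, 336), dtype="float32")

print(llava_style_pixel_values.ndim, "!=", expected_ndim)  # 4 != 5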
Hi @Hank-Kuo, I think this is because the default image preprocessing is designed for LLaVA, not for the Phi-3 vision model. We are still working on creating a unified flow for all vision models. Could you use the example code in #2658 for the Phi-3 vision model? Thanks.
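In case it helps others before the unified flow lands, here is a rough, self-contained sketch of the kind of crop-and-stack preprocessing a Phi-3 vision model expects; the 4x4 grid and naive normalization are illustrative assumptions, not the actual code from #2658:

import numpy as np
from PIL import Image

def naive_phi3v_pixel_values(path, crop=336, grid=4):
    # Toy illustration: one global crop x crop view plus grid*grid local crops,
    # stacked into a (1, num_crops, 3, crop, crop) float32 tensor.
    img = Image.open(path).convert("RGB")
    global_view = np.asarray(img.resize((crop, crop)), dtype="float32").transpose(2, 0, 1) / 255.0
    tiled = np.asarray(img.resize((crop * grid, crop * grid)), dtype="float32").transpose(2, 0, 1) / 255.0
    crops = [global_view]
    for y in range(grid):
        for x in range(grid):
            crops.append(tiled[:, y * crop:(y + 1) * crop, x * crop:(x + 1) * crop])
    return np.stack(crops)[None]

print(naive_phi3v_pixel_values("./image/test.jpeg").shape)  # (1, 17, 3, 336, 336) for the defaults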
Hi @mengshyu, here is my example code in Python:
import base64

from mlc_llm import MLCEngine

image_path = "./image/test.jpeg"
model = "output/Phi-3-vision-128k-instruct-q4f16_1-MLC"


def load_image(path):
    # Read the image and encode it as base64 so it can be inlined in the request.
    with open(path, "rb") as image_file:
        image_base64 = base64.b64encode(image_file.read()).decode("utf-8")
    return image_base64


# Create engine
engine = MLCEngine(model)
image_base64 = load_image(image_path)

# Run chat completion in OpenAI API.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": [
        {
            "type": "image_url",
            "image_url": f"data:image/jpeg;base64,{image_base64}",
        },
        {
            "type": "text",
            "text": "Describe this image",
        },
    ]}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print("\n")

engine.terminate()
🐛 Bug
To Reproduce
Using the model Phi-3-vision-128k-instruct, I ran into several bugs and need your help.
First, for the phi3-v problem: when I converted the model weights, I got
Error message
It seems the problem is in this part.
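For reference, a minimal sketch of the conversion step, assuming the standard mlc_llm CLI and placeholder local paths:

import subprocess

# Convert the HF checkpoint to MLC weights (q4f16_1 shown; both paths are placeholders).
subprocess.run(
    [
        "mlc_llm", "convert_weight",
        "./dist/models/Phi-3-vision-128k-instruct/",
        "--quantization", "q4f16_1",
        "-o", "./dist/Phi-3-vision-128k-instruct-q4f16_1-MLC",
    ],
    check=True,
)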
Second, I also want to quantize to q4f32_1 (q4f16_1 works for me), but I got an error like this:
error message
Third, I set self.rope_ext_factors = None and ran the model locally; when I sent a message with an image to the server, I also got an error:
input message
Error message
It seems the problem is in this part.
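For reference, the request I send to the server mirrors the MLCEngine example above; a rough sketch, assuming the default mlc_llm serve address (http://127.0.0.1:8000) and its OpenAI-compatible /v1/chat/completions route:

import base64
import requests

with open("./image/test.jpeg", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "output/Phi-3-vision-128k-instruct-q4f16_1-MLC",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": f"data:image/jpeg;base64,{image_base64}"},
            {"type": "text", "text": "Describe this image"},
        ],
    }],
}

response = requests.post("http://127.0.0.1:8000/v1/chat/completions", json=payload)
print(response.status_code, response.json())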
Expected behavior
Environment
- How you installed MLC-LLM (conda, source): python pre-built package
- How you installed TVM-Unity (pip, source): pip
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):