BAAI-DCAI / Bunny

A family of lightweight multimodal models.
Apache License 2.0

Bunny-Llama-3-8B-V #60

Closed believewhat closed 2 months ago

believewhat commented 2 months ago

I directly used the code from Hugging Face to generate the answer, but got some strange symbols such as: !!!!!!!!!!!!

Could you please help me?

Isaachhh commented 2 months ago

Could you please show us more information about your image and text prompt?

believewhat commented 2 months ago

> Could you please show us more information about your image and text prompt?

The text prompt comes from the example posted on Hugging Face (https://huggingface.co/BAAI/Bunny-Llama-3-8B-V):

prompt = 'Why is the image funny?'
text = f"A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: <image>\n{prompt} ASSISTANT:"

The image is icon.png.

The code is the same as the Hugging Face example. The only difference is that I used a local path instead of the repo id to load the model.
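For reference, the template can be reproduced with plain string operations. This is a sketch assuming the `<image>` placeholder from the model card, which the preprocessing splits on before tokenizing each side:

```python
prompt = 'Why is the image funny?'

# Chat template as on the model card; '<image>' marks where the image
# features are spliced into the token sequence.
text = ("A chat between a curious user and an artificial intelligence "
        "assistant. The assistant gives helpful, detailed, and polite "
        "answers to the user's questions. "
        f"USER: <image>\n{prompt} ASSISTANT:")

# The text on either side of the placeholder is tokenized separately,
# so the template must contain exactly one '<image>'.
chunks = text.split('<image>')
print(len(chunks))                          # 2
print(chunks[1].startswith('\n' + prompt))  # True
```

If the placeholder is missing or duplicated, the split produces the wrong number of chunks and the image features end up in the wrong position.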

Isaachhh commented 2 months ago

I tried and got

The image is funny because it depicts a rabbit, which is not typically associated with technology or space travel, using a laptop and wearing a space suit. This juxtaposition of a cute, domestic animal in a futuristic setting creates a humorous contrast.

You may git pull to make sure your Bunny-Llama-3-8B-V weights are up-to-date.

believewhat commented 2 months ago

> I tried and got
>
> The image is funny because it depicts a rabbit, which is not typically associated with technology or space travel, using a laptop and wearing a space suit. This juxtaposition of a cute, domestic animal in a futuristic setting creates a humorous contrast.
>
> You may git pull to make sure your Bunny-Llama-3-8B-V weights are up-to-date.


I think my model's weights are up-to-date. But I still got this output: "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"

believewhat commented 2 months ago

Could you please upload your requirements.txt? Thank you very much.

believewhat commented 2 months ago

> I tried and got
>
> The image is funny because it depicts a rabbit, which is not typically associated with technology or space travel, using a laptop and wearing a space suit. This juxtaposition of a cute, domestic animal in a futuristic setting creates a humorous contrast.
>
> You may git pull to make sure your Bunny-Llama-3-8B-V weights are up-to-date.

I also want to confirm whether we are using the same code (https://huggingface.co/BAAI/Bunny-Llama-3-8B-V); I didn't use the repo id to load the model.

I also tried to load the model by the repo id 'BAAI/Bunny-Llama-3-8B-V' and got "nét!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!".

Isaachhh commented 2 months ago

> Could you please upload your requirements.txt? Thank you very much.

I just `pip install torch transformers accelerate pillow` and use the code on our model page.

You may uncomment the "disable some warnings" part to see whether there are any warnings.

And you may print the input_ids to see whether the tokenizer works well.
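A quick sanity check on the printed input_ids can be done offline. This is a sketch, assuming the LLaVA-style convention used in the model-card example where the image placeholder is encoded as the sentinel id -200 and every other id must be a non-negative vocabulary index; the helper name and example ids are made up:

```python
def check_input_ids(ids, image_token=-200):
    """Return (# of image placeholders, list of unexpected negative ids)."""
    unexpected = [i for i in ids if i < 0 and i != image_token]
    return ids.count(image_token), unexpected

# Made-up token ids around a single image placeholder:
n_images, bad = check_input_ids([128000, 32, 6369, -200, 198, 10445])
print(n_images, bad)  # 1 []
```

A healthy sequence should report exactly one image placeholder and no other negative ids; anything else points at the tokenization step rather than the model weights.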

believewhat commented 2 months ago

> > Could you please upload your requirements.txt? Thank you very much.
>
> I just `pip install torch transformers accelerate pillow` and use the code on our model page.
>
> You may uncomment the "disable some warnings" part to see whether there are any warnings.
>
> And you may print the input_ids to see whether the tokenizer works well.

I only got this warning:

/home/jwang/anaconda3/envs/bunny/lib/python3.10/site-packages/torch/nn/modules/module.py:2025: UserWarning: for vision_model.head.mlp.fc2.bias: copying from a non-meta parameter in the checkpoint to a meta parameter in the current model, which is a no-op. (Did you mean to pass assign=True to assign items in the state dictionary to their corresponding key in the module instead of copying them in place?) warnings.warn(f'for {key}: copying from a non-meta parameter in the checkpoint to a meta '

For input_ids, see the attached screenshot.

I guess the decode function works well?

believewhat commented 2 months ago

I found that if I run inference on an A100, it works, but if I run inference on an H100, it doesn't. Strange.

Isaachhh commented 2 months ago

> I found that if I run inference on an A100, it works, but if I run inference on an H100, it doesn't. Strange.

What if you try dtype=torch.bfloat16 on H100?

believewhat commented 2 months ago

I tried, but it doesn't work. That's OK, I can use another GPU for inference. Thank you very much.