haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0

[Question] How to generate caption for images in dataset on hpc #351

Open ohhiohhi opened 1 year ago

ohhiohhi commented 1 year ago

Question

I want to know how to generate captions for all the images in a dataset using LLaVA on an HPC cluster. Can anyone tell me how?

haotian-liu commented 1 year ago

Hi, you can refer to model_vqa.py for running inference on all images in a dataset.

Reference instructions (the first point suffices): https://github.com/haotian-liu/LLaVA#gpt-assisted-evaluation
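For reference, model_vqa.py reads the question file as JSONL, one object per line; as far as I can tell it looks up the fields `question_id`, `image`, and `text`. A minimal sketch for building such a file over an image folder (the folder, output path, and prompt below are placeholders):

```python
import json
import os

# Sketch: build a question file for llava.eval.model_vqa.
# Field names (question_id, image, text) follow what model_vqa.py reads;
# image_folder, the output path, and the prompt are placeholders.
image_folder = "./llava/serve/examples"
prompt = "caption the image in 30 words or less"

with open("./playground/test/test.jsonl", "w") as f:
    for i, name in enumerate(sorted(os.listdir(image_folder))):
        if name.lower().endswith((".jpg", ".jpeg", ".png")):
            f.write(json.dumps({"question_id": i, "image": name, "text": prompt}) + "\n")
```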

ohhiohhi commented 1 year ago

Thanks, I tried what you said, but a new problem arose. @haotian-liu

When I use model_vqa.py for captioning, the generated text is garbled. I didn't change the model_vqa.py file.

Pretrained model: llava-llama-2-13b-chat-lightning-preview

I ran the following command:

```
python -m llava.eval.model_vqa \
    --model-path "./llava-llama-2-13b-chat-lightning-preview" \
    --question-file ./playground/test/test.jsonl \
    --image-folder ./llava/serve/examples/ \
    --answers-file ./playground/test/test_answer.jsonl
```

Prompt: caption the image in 30 words or less

It generates garbled results for all images, such as: `c. sec: router devgn #`

At first I thought it was a model issue, so I tried the pretrained model LLaVA-Lightning-MPT-7B-preview, but the output was still a mess.

Then I thought it was an environment issue, but when I use llama-7b-hf for QA tasks, the output is normal.

Now I don't know what the cause is.

fj6833 commented 12 months ago

Hello, may I ask if you have resolved this issue? I have the same question. Could you please share your contact information? Thank you.

ohhiohhi commented 11 months ago

Sorry, I didn't fix it. We can get in touch via my email: 18191547735@163.com

jameszhou-gl commented 10 months ago

Hi @ohhiohhi, just a discussion point. I once tried using LLaVA to generate captions on the ScienceQA dataset. It worked well; at least I found no garbled results. Have you tried the ScienceQA dataset with llava.eval.model_vqa? I first converted the ScienceQA format into VQA format.
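For context, here is a rough sketch of the kind of conversion I mean, assuming the public ScienceQA release's problems.json layout (id → {question, image, split, ...}) with images stored under `<split>/<id>/`; the paths are placeholders:

```python
import json

# Sketch: convert ScienceQA entries into the JSONL format consumed by
# llava.eval.model_vqa (question_id / image / text). The problems.json
# layout and paths are assumptions based on the public ScienceQA release.
problems = json.load(open("data/scienceqa/problems.json"))

with open("playground/test/sqa_test.jsonl", "w") as f:
    for pid, prob in problems.items():
        if prob.get("split") != "test" or not prob.get("image"):
            continue  # keep only test questions that come with an image
        f.write(json.dumps({
            "question_id": pid,
            "image": f"{pid}/{prob['image']}",  # e.g. 123/image.png under the split folder
            "text": prob["question"],
        }) + "\n")
```

Then point `--image-folder` at the split's image directory (e.g. `data/scienceqa/test` under these assumptions).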

ohhiohhi commented 10 months ago

Thanks for your reply. I solved the problem by reconfiguring the environment.
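In case anyone else hits the same garbled output: before rebuilding everything, a quick check is to compare the installed package versions against the pins in the repo's pyproject.toml, since a transformers/tokenizers version drift is a common cause of gibberish generations. A small sketch:

```python
# Sketch: print versions of the packages most likely to cause garbled
# generations when they drift from the repo's pinned versions.
import importlib.metadata as md

for pkg in ("torch", "transformers", "tokenizers", "accelerate", "sentencepiece"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "NOT INSTALLED")
```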

FurkanGozukara commented 10 months ago

> Thanks for your reply. I solved the problem by reconfiguring the environment.

Hi, what prompt are you giving?

@jameszhou-gl which prompt did you use to caption images?

ohhiohhi commented 10 months ago

> Hi, what prompt are you giving?
>
> @jameszhou-gl which prompt did you use to caption images?

'caption the image with keywords'