magic-research / PLLaVA

Official repository for the paper PLLaVA

Cannot run demo.sh with pllava-34b #25

Open AmitRozner opened 1 month ago

AmitRozner commented 1 month ago

Thanks for the repo and models! When I run demo.sh with the 34b model (after commenting and uncommenting the relevant lines), I get nonsense output with the example video and prompt:

###LM OUTPUT TEXT You are Pllava, a large vision-language assistant. 
You are able to understand the video content that the user provides, and assist the user with a variety of tasks using natural language.
Follow the instructions carefully and explain your answers in detail based on the provided video.
<|im_start|> user

<|im_start|> user
<|im_start|> user
What is the woman doing? 
<|im_start|> assistant
........................................................................................................................................................................................................
Conversation(system='You are Pllava, a large vision-language assistant. \nYou are able to understand the video content that the user provides, and assist the user with a variety of tasks using natural language.\nFollow the instructions carefully and explain your answers in detail based on the provided video.\n', roles=['<|im_start|>user\n', '<|im_start|>assistant\n'], messages=[['<|im_start|>user\n', '<image>\n'], ['<|im_start|>user\n', ''], ['<|im_start|>user\n', 'What is the woman doing?'], ['<|im_start|>assistant\n', '........................................................................................................................................................................................................']], sep=['<|im_end|>\n', '<|im_end|>\n'], mm_token='<image>\n', mm_style=<MultiModalConvStyle.MM_ALONE: 'mm_alone'>, pre_query_prompt=None, post_query_prompt=None, answer_prompt=None)
Answer: ........................................................................................................................................................................................................

It also takes like 10 minutes on 4 RTX 3090 GPUs. Any thoughts?

ermu2001 commented 1 month ago

It seems like not all of the weights have been loaded. Do you have the terminal output from the demo?
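One way to check for partially loaded weights is to compare the parameter names the model expects with the names actually present in the checkpoint. The sketch below is only an illustration in plain Python: `diff_weight_keys` and the parameter names are hypothetical, and how this repo loads its weights is not shown in this thread. In PyTorch, the same information is returned as `missing_keys` and `unexpected_keys` by `model.load_state_dict(state_dict, strict=False)`.

```python
def diff_weight_keys(expected_keys, loaded_keys):
    """Compare the parameter names a model expects with those found in a checkpoint."""
    expected = set(expected_keys)
    loaded = set(loaded_keys)
    # Expected by the model but absent from the checkpoint: these stay at their initial values.
    missing = sorted(expected - loaded)
    # Present in the checkpoint but unknown to the model: often a naming or prefix mismatch.
    unexpected = sorted(loaded - expected)
    return missing, unexpected


# Hypothetical parameter names, for illustration only.
expected = [
    "language_model.layer0.weight",
    "language_model.layer1.weight",
    "vision_tower.proj.weight",
]
loaded = [
    "language_model.layer0.weight",
    "vision_tower.proj.weight",
    "extra.bias",
]

missing, unexpected = diff_weight_keys(expected, loaded)
print("missing:", missing)
print("unexpected:", unexpected)
```

If `missing` is not empty, the corresponding parameters were never overwritten by the checkpoint, which could explain degenerate output such as a long run of dots.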

The 34B model is relatively slow: it takes around 3 minutes per response on two A100s, mainly because the 34B model produces long responses.