Closed: josephzpng closed this issue 2 months ago
I changed the `modalities` argument in the `model.generate` call from a string to a list. Before:
output_ids = model.generate(inputs=input_ids, images=video, attention_mask=attention_masks, modalities="video", do_sample=False, temperature=0.0, max_new_tokens=1024, top_p=0.1, num_beams=1, use_cache=True, stopping_criteria=[stopping_criteria])
-->
After:
output_ids = model.generate(inputs=input_ids, images=video, attention_mask=attention_masks, modalities=["video"], do_sample=False, temperature=0.0, max_new_tokens=1024, top_p=0.1, num_beams=1, use_cache=True, stopping_criteria=[stopping_criteria])
Passing `modalities=["video"]` instead of `modalities="video"` solved the problem (the error shown below).
Model name: LLaVA-NeXT-Video-7B
llava/model/llava_arch.py", line 309, in prepare_inputs_labels_for_multimodal
image_feature = unpad_image(image_feature, image_sizes[image_idx])
TypeError: 'NoneType' object is not subscriptable
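
A possible explanation for why the list form works, offered as a sketch rather than a confirmed reading of `llava_arch.py`: if the model code checks `modalities` entry by entry (one entry per sample), then a bare string is compared character by character and never matches `"video"`. The input would then fall through to the image path, which calls `unpad_image` with `image_sizes` left as `None`, matching the traceback above. The functions `video_indices` and `normalize_modalities` below are hypothetical helpers written for illustration; they are not part of LLaVA-NeXT.

```python
def normalize_modalities(modalities):
    """Hypothetical helper: wrap a bare string in a list, copy other sequences."""
    if isinstance(modalities, str):
        return [modalities]
    return list(modalities)


def video_indices(modalities):
    """Illustrative per-sample check: indices whose entry equals "video".

    This assumes the real code indexes into `modalities` in a similar way.
    """
    return [i for i in range(len(modalities)) if modalities[i] == "video"]


# A bare string is indexed character by character ("v", "i", "d", "e", "o"),
# so no entry ever equals "video".
string_result = video_indices("video")

# A list has one entry per sample, so the video sample is found at index 0.
list_result = video_indices(["video"])

# Normalizing first makes both call styles behave the same way.
normalized_result = video_indices(normalize_modalities("video"))

print(string_result, list_result, normalized_result)
```

Under that assumption, calling a helper like `normalize_modalities` before `model.generate` would guard against the same mistake, though passing `modalities=["video"]` directly, as in the fix above, is simpler.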