IuvenisSapiens / ComfyUI_MiniCPM-V-2_6-int4

MiniCPM-V-2_6-int4 is integrated into ComfyUI, supporting text-only, video, single-image, and multi-image queries for generating captions or responses.
Apache License 2.0
130 stars, 8 forks

shape mismatch: value tensor of shape [1104] cannot be broadcast to indexing result of shape [1035] #21

Open Pancat007 opened 2 months ago

Pancat007 commented 2 months ago

Error occurred when executing MiniCPM_VQA:

shape mismatch: value tensor of shape [1104] cannot be broadcast to indexing result of shape [1035]

  File "D:\StableDiffusion\ComfyUI-aki-v1.3\execution.py", line 317, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\execution.py", line 192, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\custom_nodes\ComfyUI_MiniCPM-V-2_6-int4\nodes_legacy.py", line 253, in inference
    result = self.model.chat(
  File "D:\StableDiffusion\ComfyUI-aki-v1.3.cache\huggingface\modules\transformers_modules\MiniCPM-V-2_6-int4\modeling_minicpmv.py", line 380, in chat
    res = self.generate(
  File "D:\StableDiffusion\ComfyUI-aki-v1.3.cache\huggingface\modules\transformers_modules\MiniCPM-V-2_6-int4\modeling_minicpmv.py", line 256, in generate
    ) = self.get_vllm_embedding(model_inputs)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3.cache\huggingface\modules\transformers_modules\MiniCPM-V-2_6-int4\modeling_minicpmv.py", line 117, in get_vllm_embedding
    vision_embedding = self.vpm(all_pixel_values, patch_attention_mask=patch_attn_mask, tgt_sizes=tgt_sizes).last_hidden_state
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\python\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3.cache\huggingface\modules\transformers_modules\MiniCPM-V-2_6-int4\modeling_navit_siglip.py", line 903, in forward
    hidden_states = self.embeddings(pixel_values=pixel_values, patch_attention_mask=patch_attention_mask, tgt_sizes=tgt_sizes)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3\python\lib\site-packages\accelerate\hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "D:\StableDiffusion\ComfyUI-aki-v1.3.cache\huggingface\modules\transformers_modules\MiniCPM-V-2_6-int4\modeling_navit_siglip.py", line 349, in forward
    position_ids[batch_idx][p_attn_mask.view(-1).cpu()] = pos_ids
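
For anyone reading the traceback: the last frame is a boolean-mask assignment, `position_ids[batch_idx][p_attn_mask.view(-1).cpu()] = pos_ids`. That kind of assignment only succeeds when the number of selected positions in the mask equals the number of values being written, and the message suggests the mask selected 1035 positions while `pos_ids` held 1104 values. Below is a minimal pure-Python sketch of that rule, not the model's actual code: `masked_assign` is a hypothetical helper, the buffer length of 1200 is an assumption picked only for illustration, and plain lists stand in for tensors so it runs without torch.

```python
def masked_assign(target, mask, values):
    """Write `values` into the positions of `target` where `mask` is True.

    Mirrors the rule behind boolean-mask assignment: the count of True
    entries must equal the number of values (a single value is repeated).
    """
    if len(target) != len(mask):
        raise ValueError("mask and target must have the same length")
    selected = [i for i, keep in enumerate(mask) if keep]
    if len(values) == 1:
        # A single value is broadcast to every selected position.
        values = list(values) * len(selected)
    if len(values) != len(selected):
        raise ValueError(
            f"shape mismatch: value tensor of shape [{len(values)}] cannot be "
            f"broadcast to indexing result of shape [{len(selected)}]"
        )
    for i, v in zip(selected, values):
        target[i] = v
    return target


# Hypothetical sizes chosen only to reproduce the numbers in the report.
position_ids = [0] * 1200
mask = [i < 1035 for i in range(1200)]  # 1035 positions selected
pos_ids = list(range(1104))             # 1104 values supplied

try:
    masked_assign(position_ids, mask, pos_ids)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

If this reading is right, the useful thing to check is why the patch attention mask and the position ids computed from `tgt_sizes` disagree about the number of patches for this particular input.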