Open evertonipx opened 9 months ago
Following as well; I'm getting this exact issue.
It seems that the updated code does not handle cls_token correctly.
See https://github.com/X-PLUG/mPLUG-Owl/commit/54b508a7254621977c8d662d203bd0d3c8a7e428
It works if you change
if embeddings.shape[1] != self.num_patches:
to
if self.cls_token is None and embeddings.shape[1] != self.num_patches:
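To illustrate why the extra `self.cls_token is None` condition changes the outcome, here is a minimal sketch in plain Python. The function `needs_pos_resize` and its arguments are hypothetical stand-ins, not the actual code in visual_encoder.py, and the sketch assumes the cls token is prepended to the patch embeddings so that the sequence is one element longer than num_patches.

```python
def needs_pos_resize(seq_len: int, num_patches: int, has_cls_token: bool,
                     guarded: bool = True) -> bool:
    """Return True when the position-embedding resize branch would be taken.

    Hypothetical stand-in for the check discussed above: `seq_len` plays the
    role of embeddings.shape[1] and `has_cls_token` the role of
    `self.cls_token is not None`.
    """
    if guarded:
        # Suggested check: only compare lengths when there is no cls token.
        return (not has_cls_token) and seq_len != num_patches
    # Original check: compares lengths unconditionally.
    return seq_len != num_patches


num_patches = 1024          # assumed value, e.g. a 32 x 32 patch grid
seq_len = num_patches + 1   # patches plus one prepended cls token (assumption)

# Unguarded: the lengths always differ by one, so the branch always fires.
print(needs_pos_resize(seq_len, num_patches, has_cls_token=True, guarded=False))  # True
# Guarded: the branch is skipped whenever a cls token is present.
print(needs_pos_resize(seq_len, num_patches, has_cls_token=True, guarded=True))   # False
```

One consequence of the guard as written: with a cls token present the resize branch is never taken, even for an input whose patch count really does differ from num_patches.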
Fixed.
When I use only a text prompt, mPLUG-Owl2 works fine. But when I include an image, I get this error:
File "C:\py projects\IPXCopilot_OWLVersion\mplug_owl2\model\visual_encoder.py", line 117, in forward
    if self.cls_token :
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
If I change it to:
if self.cls_token is not None:
I get this error instead:
File "C:\py projects\IPXCopilot_OWLVersion\mplug_owl2\model\visual_encoder.py", line 123, in forward
    embeddings = embeddings + get_abs_pos(self.position_embedding,embeddings.size(1))
RuntimeError: The size of tensor a (1024) must match the size of tensor b (1049600) at non-singleton dimension 2
Does anyone have the same problem? It worked fine before the update.
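One observation about the second error, offered as a guess rather than a diagnosis: 1049600 factors exactly as 1025 x 1024. That would be consistent with a position embedding of 1025 positions (1024 patches plus one cls token), each of width 1024, coming back from get_abs_pos flattened into a single dimension. The sizes below are inferred from the error message only, not read from the model config.

```python
import math

hidden_size = 1024      # "size of tensor a" at dimension 2 in the error message
flat_size = 1_049_600   # "size of tensor b" at dimension 2 in the error message

# If tensor b is a flattened (positions x hidden_size) table, this divides evenly.
positions, remainder = divmod(flat_size, hidden_size)
print(positions, remainder)   # 1025 0

# Dropping one position for a cls token leaves a perfect square patch grid.
patches = positions - 1
grid_side = math.isqrt(patches)
print(patches, grid_side, grid_side * grid_side == patches)   # 1024 32 True
```

If that reading is right, the mismatch comes from the resize path being applied to a sequence that already includes the cls token, which is the case the `self.cls_token is None` guard in the reply above avoids.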