Open NguyenVH01 opened 3 months ago
I did not encounter this problem. Can you show me the traceback, so I can see which layer of which stage the problem occurs in?
Actually, when I run it directly as in your example, it works normally:
But when I run training on my dataset, it fails in load_pretrained_ema, as shown in this image:
I also tried the same config with the checkpoint, and it fails with the same error.
Could you help me understand this error? Thank you.
Thank you for your amazing work on the Mamba2 model. I am currently trying to load a pretrained model on VMamba Tiny-224 for image classification, but I encountered the following error:
File "vmamba.py", line 48, in _load_from_state_dict
    state_dict[prefix + "weight"] = state_dict[prefix + "weight"].view(self.weight.shape)
RuntimeError: shape '[192, 96]' is invalid for input of size 9216
It seems that the shape of the weight tensor does not match the expected input size. Could you please provide guidance on how to resolve this issue? Is there a specific step I might be missing or a modification needed in the model architecture?
Thank you.
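For anyone debugging the same message: the target shape [192, 96] needs 192 × 96 = 18432 elements, while the checkpoint tensor holds 9216 = 96 × 96, so the layer in the model appears to be built with different dimensions than the layer that was saved. A common cause is a model config that differs from the one used to train the checkpoint, though that is only a guess here. Below is a minimal sketch for listing such mismatches before loading. The helper name `find_shape_mismatches`, the layer names, and the `"model"` checkpoint key are all hypothetical and not taken from the VMamba code.

```python
from math import prod


def find_shape_mismatches(model_shapes, ckpt_shapes):
    """Return {name: (model_shape, ckpt_shape)} for shared keys whose
    element counts differ, i.e. tensors that .view() cannot reshape."""
    mismatches = {}
    for name, model_shape in model_shapes.items():
        ckpt_shape = ckpt_shapes.get(name)
        if ckpt_shape is None:
            continue  # keys missing from the checkpoint are a separate problem
        if prod(model_shape) != prod(ckpt_shape):
            mismatches[name] = (tuple(model_shape), tuple(ckpt_shape))
    return mismatches


# With PyTorch, the two shape maps could be built roughly like this
# (assumption: the checkpoint stores its weights under a "model" key):
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}
#   ckpt = torch.load(path, map_location="cpu")["model"]
#   ckpt_shapes = {k: tuple(v.shape) for k, v in ckpt.items()}

# Hypothetical layer names and shapes chosen to mirror the numbers in the error.
model_shapes = {"layer.in_proj.weight": (192, 96), "layer.norm.weight": (96,)}
ckpt_shapes = {"layer.in_proj.weight": (96, 96, 1, 1), "layer.norm.weight": (96,)}

result = find_shape_mismatches(model_shapes, ckpt_shapes)
for name, (expected, found) in result.items():
    print(f"{name}: model expects {expected}, checkpoint has {found}")
```

The sketch compares element counts rather than exact shapes because the failing line reshapes with `.view(self.weight.shape)`, which succeeds whenever the counts agree (for example a (192, 96, 1, 1) tensor viewed as (192, 96)) and raises only when they differ.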