MzeroMiko / VMamba

VMamba: Visual State Space Models, code is based on mamba
MIT License
1.82k stars 98 forks

increase k_group #243

Closed Zizzzzzzz closed 5 days ago

Zizzzzzzz commented 1 week ago

I want to increase the number of scanning paths (i.e., increase k_group to 6). After modifying the scanning functions (CrossScan and CrossMerge in csms6s), I found that model training does not converge. Could you please help? Are there any other modifications I need to make besides CrossScan and CrossMerge?

MzeroMiko commented 1 week ago

In theory, CrossScan and CrossMerge are the only functions you need to modify, except for the dimensions of the corresponding parameters. However, Mamba-based models break more easily, so you need to design your structure carefully. You should also check your code to make sure it actually implements the approach you have in mind.
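
One way to check that a modified scan/merge pair matches the intended design is to test it in isolation before training. The sketch below is a plain NumPy illustration, not VMamba's CrossScan/CrossMerge code. The helper names (`build_paths`, `cross_scan`, `cross_merge`, `check_scan_merge`) and the choice of an anti-diagonal order as the extra direction are assumptions made for the example. It assumes each path is a permutation of the H*W pixel positions and that the merge sums the paths after undoing their ordering. Under those assumptions, three properties should hold: every path visits each pixel exactly once, merging a scanned input returns K times the input, and the merge is the adjoint of the scan (which is what a hand-written backward pass for the scan would need to compute).

```python
import numpy as np


def build_paths(H, W):
    """Return a (K, H*W) array; row k lists flat pixel indices in visiting order."""
    idx = np.arange(H * W)
    row_major = idx
    col_major = idx.reshape(H, W).T.reshape(-1)
    # Hypothetical third order: walk the anti-diagonals (sort by i + j, then i).
    ii, jj = np.divmod(idx, W)
    diagonal = idx[np.lexsort((ii, ii + jj))]
    forward = np.stack([row_major, col_major, diagonal])
    # Reversing each forward path gives the other three, so K = 6.
    return np.concatenate([forward, forward[:, ::-1]], axis=0)


def cross_scan(x, paths):
    """(B, C, H, W) -> (B, K, C, L): reorder the flattened pixels along each path."""
    B, C, H, W = x.shape
    flat = x.reshape(B, C, H * W)
    return np.stack([flat[:, :, p] for p in paths], axis=1)


def cross_merge(y, paths, H, W):
    """(B, K, C, L) -> (B, C, H, W): undo each path's ordering and sum over paths."""
    B, K, C, L = y.shape
    out = np.zeros((B, C, L), dtype=y.dtype)
    for k, p in enumerate(paths):
        # p is a permutation, so each output position receives exactly one value.
        out[:, :, p] += y[:, k]
    return out.reshape(B, C, H, W)


def check_scan_merge(H, W, B=2, C=3, seed=0):
    """Return (paths are permutations, merge(scan(x)) == K * x, adjoint identity holds)."""
    rng = np.random.default_rng(seed)
    paths = build_paths(H, W)
    K, L = paths.shape
    x = rng.standard_normal((B, C, H, W))
    y = rng.standard_normal((B, K, C, L))
    is_perm = bool(np.all(np.sort(paths, axis=1) == np.arange(L)))
    merged = cross_merge(cross_scan(x, paths), paths, H, W)
    round_trip = bool(np.allclose(merged, K * x))
    lhs = np.sum(cross_scan(x, paths) * y)            # <scan(x), y>
    rhs = np.sum(x * cross_merge(y, paths, H, W))     # <x, merge(y)>
    adjoint = bool(np.isclose(lhs, rhs))
    return is_perm, round_trip, adjoint


print(check_scan_merge(3, 4))
```

If any of the three flags comes back False for a custom path set, the scan and merge disagree about the ordering, which is a plausible cause of training that does not converge. The same checks can be ported to the actual tensors and functions in the model, using a non-square H and W so that transposition mistakes show up.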

Zizzzzzzz commented 5 days ago

I understand, thanks for the answer!