Closed by Zizzzzzzz 5 days ago
Theoretically, CrossScan and CrossMerge are the only functions you need to modify, except for the dimensions of the parameters that depend on the number of scan paths. However, Mamba-based models break more easily, so you need to design your structure carefully. Also, you should check your code to make sure it matches the approach you have in mind.
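One way to check that the code matches the intended approach is a round-trip test: with nothing applied between scan and merge, merging the scanned sequences should return every position exactly K times. The snippet below is only a minimal pure-Python sketch of that idea on a flattened H x W grid. The function names (`build_scan_orders`, `cross_scan`, `cross_merge`, `check_round_trip`) and the two extra "snake" paths are hypothetical stand-ins for illustration, not the actual CrossScan / CrossMerge implementation in csms6s.

```python
# Toy stand-in for a 6-path scan/merge pair (all names here are hypothetical).
# Positions are indexed row-major: index = row * w + col.

def build_scan_orders(h, w):
    """Return K = 6 visiting orders over the h * w flattened positions."""
    row_major = list(range(h * w))
    col_major = [r * w + c for c in range(w) for r in range(h)]
    # Assumed extra path: rows traversed in alternating direction ("snake").
    snake = [r * w + (c if r % 2 == 0 else w - 1 - c)
             for r in range(h) for c in range(w)]
    forward = [row_major, col_major, snake]
    # Each forward path is paired with its reversed counterpart.
    return forward + [order[::-1] for order in forward]

def cross_scan(x, orders):
    """Reorder the flat sequence once per path: out[k][t] = x[orders[k][t]]."""
    return [[x[i] for i in order] for order in orders]

def cross_merge(ys, orders):
    """Scatter each path's outputs back to their positions and sum over paths."""
    merged = [0.0] * len(orders[0])
    for y, order in zip(ys, orders):
        for t, i in enumerate(order):
            merged[i] += y[t]
    return merged

def check_round_trip(h, w):
    """Verify the paths are permutations and that merge undoes scan."""
    orders = build_scan_orders(h, w)
    n = h * w
    for order in orders:
        # Every path must visit each position exactly once.
        assert sorted(order) == list(range(n))
    x = [float(i) for i in range(n)]
    merged = cross_merge(cross_scan(x, orders), orders)
    # With nothing between scan and merge, each position is summed K times.
    assert merged == [len(orders) * v for v in x]
    return len(orders)

# A non-square grid helps expose swapped height/width handling.
num_paths = check_round_trip(3, 5)
```

If the same check fails for your real scan and merge functions, the merge is probably not restoring one of the new paths to its original positions, which could explain training that does not converge. It is also worth running the check on the backward pass, since the gradient of a scan has to apply the matching merge.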
I understand, thanks for the answer!
I want to increase the number of scanning paths (i.e. increase k_group to 6). After modifying the scanning algorithms (CrossScan and CrossMerge in csms6s), I found that the model training does not converge. Could you please help me? Are there any other modifications I need to make besides CrossScan and CrossMerge?