Hi @xinggangw @Unrealluver @ifzhang @LegendBC ,
Thanks for releasing this great work. I'm trying to interpret the model so that its predictions are more explainable. Using the pretrained image-classification weights, I'd like to know how I can record, during training or inference, the attention weights generated within the BSSM at each time step. I have gone through mamba_model.py, where we have the hidden states and residuals. Can I use those directly to understand the model's predictions, or do I need to extract other details via main.py or mamba_model.py? If so, what other directions could we look into, e.g., saliency maps? To make the question concrete, I have sketched below the kind of hook-based logging I have in mind. Any suggestions would be highly useful. Thanks in advance.
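Here is the rough sketch I mentioned, showing how I imagine capturing per-block states with PyTorch forward hooks. This is not working code from this repo: the `VisionMamba` constructor, the `model.layers` attribute, and the assumption that each block returns a `(hidden_states, residual)` tuple are guesses on my part based on similar Mamba codebases.

```python
# Rough sketch: register forward hooks to record each block's hidden states
# during a forward pass. Names below are assumptions, not this repo's API.
import torch

activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Assuming each block returns (hidden_states, residual); keep the
        # hidden states on CPU so memory doesn't grow across many passes.
        hidden = output[0] if isinstance(output, tuple) else output
        activations.setdefault(name, []).append(hidden.detach().cpu())
    return hook

model = VisionMamba(...)  # hypothetical constructor; load the pretrained weights here
model.eval()

handles = [
    block.register_forward_hook(make_hook(f"block_{i}"))
    for i, block in enumerate(model.layers)  # `layers` attribute is an assumption
]

with torch.no_grad():
    _ = model(torch.randn(1, 3, 224, 224))  # dummy input for illustration

for h in handles:
    h.remove()

# activations["block_0"][0] would then hold that block's hidden states.
```

If hooks like this are a reasonable way to capture the per-block states, my plan would be to project them back onto the image patches for saliency-style visualizations.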