gabemarx / BMI2002


Suggestion on TransMIL #1

Open Ajaz-Ahmad opened 2 years ago

Ajaz-Ahmad commented 2 years ago

Hi,

I saw your comment on the TransMIL repository (link: https://github.com/szc19990412/TransMIL/issues/2). I didn't have any other way to reach you, so I am pinging you here. I have a very similar question: I am trying to generate the attention map. The attention matrix I have has shape (1, 8, 5281, 5281), where 5281 is the number of tiles in my slide. My question is: what does the 8 (number of heads) indicate, and how do I use it to plot heatmaps? Do I need to merge the scores as follows: (1, 5281, 8, 5281), then (1, 5281, 42248)? Thanks in advance.

gabemarx commented 2 years ago

Hi Ajaz,

Sorry for the late reply, I have been in medical school and busy with clinical work.

I probably need a bit more info to best answer your question, but it does not quite make sense that the number of tiles you have is 5281; you should double-check that this is indeed the case. This pipeline works by reshaping your slide's tile sequence into a square grid, which requires a certain amount of padding. In addition, a class token is added to the stack. The attention map you want is the row of attention values belonging to this class token.

The number of heads is the number of independent attention modules that run simultaneously. You technically should have 8 unique attention heatmaps, one for each attention head.

Ajaz-Ahmad commented 2 years ago

Hi Gabe,

Thanks for your reply. Here is more information about my feature-creation step. I have a slide with 42x44 = 1848 tiles, each of size 512x512. I feed these tiles to a ResNet-50 and get a 1024-dimensional feature vector per tile, i.e. 1848x1024 features, which I feed to TransMIL. Training goes well and the test results are also good. Now I want to visualize the attention values, and I could not work out how to do that.

B x n_heads x (n + add_length + padding + 1) x (n + add_length + padding + 1)

1 x 8 x (1848 + 1 + 198 +1 ) X (1848 + 1 + 198 + 1)

Now I have a 1x8x2048x2048 matrix. What is the next step? The first row should be the class token, I believe. Thanks

gabemarx commented 2 years ago

So to extract the attention values for each tile for attention head a, you would index [0, a, 0, :1848]
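As a NumPy sketch of that indexing (the array here is random stand-in data, not real model output; whether the tile tokens start at column 0 or 1 depends on where your model concatenates the class token, so check that against your own code):

```python
import numpy as np

n_tiles = 1848
# stand-in for the real attention tensor of shape (B, n_heads, L, L)
attn = np.random.rand(1, 8, 2048, 2048)

head = 0  # pick one of the 8 attention heads
# row 0 is the class token; take its attention over the tile tokens
tile_scores = attn[0, head, 0, :n_tiles]
# normalize so the scores sum to 1 before mapping them back
# onto the 42x44 tile grid for a heatmap
tile_scores = tile_scores / tile_scores.sum()
```

The resulting 1848-length vector can then be reshaped to the slide's tile grid and rendered with any heatmap plotting tool.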

NewSun55 commented 1 year ago

> So to extract the attention values for each tile for attention head a you would index [0,a,0,:1848]

Your answer is a good solution to my question, but for the attention scores I would like to ask: does single-head visualization work better, or averaging over multiple heads?
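For comparison, averaging the class-token row across heads is straightforward to try alongside single-head maps (a sketch on stand-in data, not a claim about which looks better; averaging tends to smooth out head-specific noise, while individual heads can highlight distinct regions):

```python
import numpy as np

n_tiles = 1848
attn = np.random.rand(1, 8, 2048, 2048)  # stand-in attention tensor

# single head: class-token row of head 0 over the tile tokens
single_head = attn[0, 0, 0, :n_tiles]

# multi-head average: mean of the class-token row across all 8 heads
head_avg = attn[0, :, 0, :n_tiles].mean(axis=0)
```

Plotting both side by side on a few slides is usually the quickest way to decide which is more informative for your data.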

Rokinluohhh commented 9 months ago

Hi, can you provide your code for attention visualization? Thanks a lot!