Mairafatoretto opened this issue 1 year ago
No, we have only released the code for the main functionalities: pre-processing, training, and testing.
Hello Kevin. Could you explain how you made Figure 4? Since your model has two GA modules, it's not clear to me how you compute the attention score for each patch.
Hi, basically you can modify the forward pass of the model so that it also returns the attention scores A_1_aux as well as the indices of the selected patches. Then you can infer the corresponding patch coordinates so that they can be highlighted in the WSI.
It's not super straightforward. I will try to provide a script for this in the coming days, but it may take some time.
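In the meantime, here is a rough sketch of the kind of change I mean. The layer names, the top-k logic, and the pooling here are placeholders, not the actual repo code; the only point is that the forward pass additionally returns A_1_aux and the selected indices:

```python
# Toy sketch: a forward pass that also exposes the attention scores and the
# indices of the selected patches, so they can be used for attention maps.
import torch
import torch.nn as nn

class TwoStageAttentionSketch(nn.Module):
    def __init__(self, feat_dim=512, top_k=20, n_classes=2):
        super().__init__()
        self.attention_1 = nn.Linear(feat_dim, 1)   # stand-in for the first gated attention
        self.attention_2 = nn.Linear(feat_dim, 1)   # stand-in for the second attention
        self.classifier = nn.Linear(feat_dim, n_classes)
        self.top_k = top_k

    def forward(self, h_low, h_high):
        # h_low / h_high: [N, feat_dim] patch features at low / high magnification
        A_1_aux = torch.softmax(self.attention_1(h_low).squeeze(-1), dim=0)

        # indices of the top-k patches passed on to the second stage
        select_1 = torch.topk(A_1_aux, k=min(self.top_k, h_low.shape[0])).indices

        h_sel = h_high[select_1]
        A_2 = torch.softmax(self.attention_2(h_sel).squeeze(-1), dim=0)
        slide_feat = (A_2.unsqueeze(-1) * h_sel).sum(dim=0, keepdim=True)
        logits = self.classifier(slide_feat)

        # also return what is needed for the attention maps
        return logits, A_1_aux, select_1
```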
Hmm, I understand the idea, Kevin, but which variable contains the indices of the patches? And how will I know the coordinates, if they are not given by the model, only in the preprocessing?
For the indices you can use, e.g., select_1. A preprocessed .h5 file should also contain the fields 1.25x_coords, 2.5x_coords, etc., from which you can retrieve the patch coordinates.
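A minimal sketch of what I mean, assuming an h5py-readable file with a field named 1.25x_coords as above (the file name and the column slicing are assumptions, not the exact repo format):

```python
# Map the selected patch indices back to coordinates stored during preprocessing.
import h5py
import numpy as np

with h5py.File("slide_preprocessed.h5", "r") as f:   # hypothetical file name
    coords_low = np.array(f["1.25x_coords"])          # one entry per patch

# select_1: the indices returned by the modified forward pass (a torch tensor)
select_1_np = select_1.cpu().numpy()

# assumption: the first two values per patch are the (column, row) grid indices
selected_coords = coords_low[select_1_np][:, :2]
```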
But do you only plot the attention for the top k? And does select_1 follow the same order as the preprocessed .h5?
You're right, what I described is for plotting only the top k. If you want to plot all attention values, you don't need the selected coordinates: you can use the attention A_1_aux and the coordinates from the .h5 file. The order should match.
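For instance, something like this (a sketch assuming A_1_aux is the per-patch attention from the modified forward pass, the score order matches the .h5 order, and the first two stored values are (column, row) grid indices):

```python
# Turn all per-patch attention values into a coarse heatmap (one value per grid cell).
import h5py
import numpy as np

with h5py.File("slide_preprocessed.h5", "r") as f:        # hypothetical file name
    coords = np.array(f["1.25x_coords"])[:, :2].astype(int)

scores = A_1_aux.detach().cpu().numpy()                    # one attention value per patch

grid = np.zeros((coords[:, 1].max() + 1, coords[:, 0].max() + 1), dtype=np.float32)
grid[coords[:, 1], coords[:, 0]] = scores                  # row index -> y, column index -> x

# upscale grid by the patch size (e.g. np.kron or cv2.resize) before overlaying it on the WSI
```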
When you overlay the actual WSI with the attention scores, it's important to load it at the right magnification and pad it in the same way as during preprocessing (similar to what is done here).
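Roughly like this, using OpenSlide. This is not the repo's pad_image_with_factor, only an illustration of why the padding has to match; the pyramid level, patch size, and padding value are assumptions:

```python
# Load the slide at a low-resolution level and pad it to a multiple of the patch size.
import numpy as np
import openslide

slide = openslide.OpenSlide("slide.svs")     # hypothetical path
level = 3                                    # assumed: level matching the coords' magnification
w, h = slide.level_dimensions[level]
img = np.array(slide.read_region((0, 0), level, (w, h)).convert("RGB"))

patch_size = 256                             # assumed patch size at that magnification
pad_h = (-img.shape[0]) % patch_size
pad_w = (-img.shape[1]) % patch_size
img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant", constant_values=255)
```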
I understood this part, Kevin. However, the output of the pad_image_with_factor function is a NumPy array, and I'm having trouble getting back to the original image format (.svs in my case) to plot the heatmap.
If it's just for visualization, you don't have to go back to .svs. You can use Image.fromarray() from the PIL library to convert the NumPy array to an image that you can save.
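For example (the array here is just a stand-in for your padded WSI or blended overlay, and the file name is arbitrary):

```python
import numpy as np
from PIL import Image

arr = np.zeros((1024, 1024, 3), dtype=np.uint8)    # stand-in for the padded WSI / heatmap overlay
Image.fromarray(arr).save("attention_overlay.png")
```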
Hi Kevin, I'm sorry, but it's still not clear to me what you are saving in the coordinates. Are they columns and rows? Are they x and y? And if they are columns and rows, how do I know the total number of columns and rows each image has?
Hi, the coordinates represent the column and row indices of the patches. If you multiply the indices by the patch size (256), you get the absolute coordinates y and x. What do you need the total number of columns/rows for?
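For example, for a single patch (the grid indices here are made up):

```python
patch_size = 256
col_idx, row_idx = 12, 34        # example grid indices of one patch
x_abs = col_idx * patch_size     # 3072 px along x
y_abs = row_idx * patch_size     # 8704 px along y
```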
Hi Kevin, that's strange, because in the second dimension I have 4 values for each patch, the last two of which are only binary (0, 1).
Another strange point is that this multiplication by 256 only works for 20x magnification images and not for 40x. I'm trying to use DeepZoom's get_tile_coordinates to get back the original coordinates, but apparently it doesn't return the exact coordinates either.
Kevin, when you extract the highest attention scores, the program does a torch.einsum. This makes my score vector very small: it differs from the number of patches and has the same size for all images. What is the purpose of applying this transformation to x2?
This is the actual patch selection process (formulated as a matrix multiplication, for which we use torch.einsum()).
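A toy illustration of that idea (not the repo's exact code): with a one-hot selection matrix S, the einsum simply gathers the k selected rows, which is why the resulting tensors always have size k rather than the number of patches.

```python
# Top-k patch selection written as a matrix multiplication with torch.einsum.
import torch

N, k, d = 100, 20, 512
H = torch.randn(N, d)                         # one feature vector per patch
A = torch.rand(N)                             # attention scores

top_idx = torch.topk(A, k).indices
S = torch.zeros(k, N)
S[torch.arange(k), top_idx] = 1.0             # one-hot rows pick the selected patches

H_selected = torch.einsum('kn,nd->kd', S, H)  # equivalent to H[top_idx]
assert torch.allclose(H_selected, H[top_idx])
```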
Have you made the scripts for generating the attention maps available? I would like to better understand how they were generated.