HUANGLIZI / LViT

[IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"
MIT License

Inquiry About Generating CAM Images with LViT #45

Closed: Studentpengyu closed 3 months ago

Studentpengyu commented 3 months ago

Dear Zihan,

I hope this message finds you well.

I have been following your work with great interest. I am currently trying to replicate the interpretability study in your paper, but I have run into some difficulties.

I am using the package 'pytorch_grad_cam' with the following code snippet: [two images, not reproduced here]

'cam' requires the model's input to be a single tensor, whereas the LViT model takes two tensors (image, text) as input. How did you generate the CAM figures in your work? Could you kindly provide some guidance or share the relevant code?

Thank you very much for your time and assistance.

Best regards, Pengyu Zhao

HUANGLIZI commented 3 months ago

You should change the source code of GradCAM and adapt it to accept two inputs.
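For readers facing the same single-tensor constraint, here is a minimal sketch of one possible workaround. It is not the code used for the paper and it does not modify the GradCAM source as suggested above. It wraps a two-input model so that the text is held fixed and only the image is passed in, then computes a bare-bones Grad-CAM with a forward hook. `ToyTwoInputSeg`, `FixedTextWrapper`, and `grad_cam` are hypothetical names introduced for illustration. The toy model is only a stand-in with an (image, text) forward and is not LViT. Summing the output logits as the scalar target is also an assumption.

```python
import torch
from torch import nn
import torch.nn.functional as F


class ToyTwoInputSeg(nn.Module):
    """Hypothetical stand-in for a model whose forward takes (image, text)."""

    def __init__(self, text_dim=8):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.text_proj = nn.Linear(text_dim, 4)
        self.head = nn.Conv2d(4, 1, kernel_size=1)

    def forward(self, image, text):
        feat = F.relu(self.conv(image))
        # Modulate image features with a per-channel text embedding.
        feat = feat * self.text_proj(text)[:, :, None, None]
        return self.head(feat)


class FixedTextWrapper(nn.Module):
    """Exposes a single-tensor forward by holding the text input fixed."""

    def __init__(self, model, text):
        super().__init__()
        self.model = model
        self.text = text

    def forward(self, image):
        return self.model(image, self.text)


def grad_cam(model, target_layer, image):
    """Minimal Grad-CAM: weight activations by spatially averaged gradients."""
    store = {}
    handle = target_layer.register_forward_hook(
        lambda module, inputs, output: store.update(act=output)
    )
    try:
        out = model(image)
        # Assumed scalar target for segmentation: sum of all output logits.
        grads = torch.autograd.grad(out.sum(), store["act"])[0]
    finally:
        handle.remove()
    weights = grads.mean(dim=(2, 3), keepdim=True)      # one weight per channel
    cam = F.relu((weights * store["act"]).sum(dim=1))   # shape (B, H, W)
    cam = cam - cam.amin(dim=(1, 2), keepdim=True)
    cam = cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)
    return cam.detach()


if __name__ == "__main__":
    model = ToyTwoInputSeg().eval()
    image = torch.randn(1, 3, 16, 16)
    text = torch.randn(1, 8)
    wrapped = FixedTextWrapper(model, text)
    heatmap = grad_cam(wrapped, model.conv, image)
    print(heatmap.shape)
```

The wrapper keeps the original model untouched, so in principle the same object could be handed to a library that expects a single-tensor model. Whether that works with a particular version of 'pytorch_grad_cam' and with the real LViT layers is not confirmed here.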

Studentpengyu commented 3 months ago

Thank you for your response. I have obtained the expected results for the LViT model and cited it correctly. I appreciate all your help along the way.