Closed: IshitaB28 closed this issue 1 month ago
Hi there, I am facing a similar issue. My model requires an input that consists of a list containing two tensors. How did you handle this? Could you share your solution?
Hi, so I handled it by combining the inputs into one tensor and then separating them out in the forward function of my model.
For example, if you need to pass 4 inputs into your model, you do:
```python
inp = torch.cat((a, b, c, d), dim=0)
grayscale_cam = cam(input_tensor=inp, targets=None)
```
And then, in the forward function of your model:
```python
def forward(self, inp):
    a, b, c, d = inp[0], inp[1], inp[2], inp[3]
```
before you proceed with further steps
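A minimal runnable sketch of this pattern (the model, layer sizes, and shapes below are made up for illustration; `torch.stack` is used here so each unbatched input can be recovered by indexing along dim 0):

```python
import torch
import torch.nn as nn

class MultiInputNet(nn.Module):
    """Toy model: takes one stacked tensor, splits it back into four
    inputs inside forward(), then runs a 1x1 conv over their sum."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 1, kernel_size=1)

    def forward(self, inp):
        # inp has shape (4, C, H, W): the four original inputs stacked
        a, b, c, d = inp[0], inp[1], inp[2], inp[3]
        combined = (a + b + c + d).unsqueeze(0)  # restore a batch dim
        return self.conv(combined)

# Four same-shaped (unbatched) inputs, combined into a single tensor
a = torch.randn(3, 8, 8)
b = torch.randn(3, 8, 8)
c = torch.randn(3, 8, 8)
d = torch.randn(3, 8, 8)
inp = torch.stack((a, b, c, d), dim=0)  # shape (4, 3, 8, 8)

model = MultiInputNet()
out = model(inp)
print(out.shape)  # torch.Size([1, 1, 8, 8])
```

This single-tensor `inp` is then what gets passed as `input_tensor` to the CAM object, since the library expects one tensor rather than a list.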
Thank you for your prompt response. I will try it!
Hi, I have modified the input format by first combining the inputs into one tensor and then separating them out in the forward function of my model.
In my case, my two inputs are an image tensor and a text tensor. Since the two have different sizes, direct concatenation is not possible. I flatten both and concatenate along the first dimension, recording the lengths and shapes for later separation. The model works well this way. However, there is still an issue.
Here is a snippet of my code:
```python
inp = torch.cat((text_flat, image_flat), dim=0)
grayscale_cam = cam(input_tensor=inp, targets=None)
```
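Filling in the bookkeeping around that snippet, the flatten-and-split approach might look roughly like this (the shapes here are invented for illustration):

```python
import torch

# Hypothetical inputs: a text tensor and an image tensor of different shapes
text = torch.randn(1, 77, 512)
image = torch.randn(1, 3, 224, 224)

# Flatten both and record the metadata needed to split them apart later
text_flat = text.flatten()
image_flat = image.flatten()
text_len = text_flat.numel()
text_shape = text.shape
image_shape = image.shape

# Single combined tensor, as passed to the CAM object
inp = torch.cat((text_flat, image_flat), dim=0)

# Inside the model's forward(): undo the flattening with the recorded metadata
text_rec = inp[:text_len].reshape(text_shape)
image_rec = inp[text_len:].reshape(image_shape)

assert torch.equal(text_rec, text)
assert torch.equal(image_rec, image)
```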
The grayscale_cam comes back flattened too, and its size is not what I expected: it equals the combined size of the text and image tensors, whereas I expected it to match the image tensor alone, because I need to display grayscale_cam on the image.
Therefore, I extracted the image part, but the resulting image was completely incorrect.
Hi there, I wanted to let you know that my issue has been resolved.
Here is my solution: I moved the text features into the forward function as fixed features. Since my text features are extracted by a large language model and do not need updating, this approach works well.
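One way to sketch that solution (all class and variable names here are hypothetical, not from the thread) is a thin wrapper module that bakes the precomputed text features in, so the CAM call sees a single-input model:

```python
import torch
import torch.nn as nn

class ImageOnlyWrapper(nn.Module):
    """Wraps a two-input model, fixing precomputed text features so that
    forward() only needs the image tensor."""
    def __init__(self, model, text_features):
        super().__init__()
        self.model = model
        # register_buffer stores the fixed features without making them
        # trainable parameters
        self.register_buffer("text_features", text_features)

    def forward(self, image):
        return self.model(image, self.text_features)

# Toy two-input model, purely for illustration
class TwoInputModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(16 + 8, 2)

    def forward(self, image_feat, text_feat):
        # Broadcast the single text-feature row across the image batch
        joined = torch.cat(
            [image_feat, text_feat.expand(image_feat.size(0), -1)], dim=1
        )
        return self.head(joined)

text_features = torch.randn(1, 8)   # e.g. precomputed by a language model
wrapped = ImageOnlyWrapper(TwoInputModel(), text_features)
out = wrapped(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 2])
```

With a wrapper like this, `cam(input_tensor=image_tensor, targets=None)` can be called with just the image, and the returned CAM has the image's dimensions.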
Thank you for your help!
Hello, good to know that it's solved now. Thanks for sharing the issue and the solution!
I am trying to use GradCAM on my model, which takes more than one input argument. I tried to pass `input_tensor` as a list of 4 arguments and was modifying the source code accordingly, but I am facing a series of errors. I am at this point now:

```
Traceback (most recent call last):
  File "/home/ishita-wicon/Documents/QA/ISIQA/UNET/expl_exp.py", line 449, in <module>
    grayscale_cam = cam(input_tensor=[left_patches, right_patches, left_image_patches, right_image_patches], targets=targets)
  File "/home/ishita-wicon/.local/lib/python3.10/site-packages/pytorch_grad_cam/base_cam.py", line 192, in __call__
    return self.forward(input_tensor,
  File "/home/ishita-wicon/.local/lib/python3.10/site-packages/pytorch_grad_cam/base_cam.py", line 105, in forward
    cam_per_layer = self.compute_cam_per_layer( input_tensor,  # changed to * for list
TypeError: BaseCAM.compute_cam_per_layer() takes 4 positional arguments but 7 were given
```
Is there any other way?