Ambiguity in GradCam Visualization with Grayscale Images

DariiaKhoroshchuk commented 7 months ago

Hi, I trained ResNet18 on single-channel X-ray scans and ran into an issue when using GradCam for visualization: I can't tell whether use_rgb should be True or False. With use_rgb=False, I get the same result whether I convert the image to RGB or to BGR. With use_rgb=True, the result differs from the use_rgb=False case, but is again the same for the RGB and BGR conversions. The original image is grayscale, and for visualization I duplicated it into a 3-channel image, so it makes sense that the conversion doesn't matter. What I can't work out is whether I should use True or False.

The code for my model looks like this:

import torch.nn as nn
from torchvision import models

def get_custom_resnet18():
    model = models.resnet18(pretrained=False)
    # Input: single-channel (grayscale) image of size 1024x1024
    model.conv1 = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    # Binary classification
    num_ftrs = model.fc.in_features
    model.fc = nn.Linear(num_ftrs, 2)

    return model

To use GradCam, I need to reshape the image to (H, W, 3). The original image shape is torch.Size([1, 2048, 2048]). I changed it to torch.Size([2048, 2048, 1]) using img = np.transpose(img, (1, 2, 0)), and then to (2048, 2048, 3) using img = cv2.cvtColor(np.array(img), cv2.COLOR_GRAY2RGB).
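For reference, the conversion described above can be sketched with NumPy alone. This is only an illustration: the helper name gray_chw_to_hwc3 is made up, np.repeat stands in for cv2.COLOR_GRAY2RGB, and the min-max rescale assumes the overlay step wants a float image in [0, 1]. It also shows why the RGB and BGR versions of a duplicated grayscale image come out identical.

```python
import numpy as np

def gray_chw_to_hwc3(img_chw):
    """Turn a (1, H, W) grayscale array into an (H, W, 3) float32 image in [0, 1].

    Hypothetical helper for illustration; not part of pytorch-grad-cam.
    """
    img = np.asarray(img_chw, dtype=np.float32)
    if img.ndim != 3 or img.shape[0] != 1:
        raise ValueError(f"expected shape (1, H, W), got {img.shape}")
    img = np.transpose(img, (1, 2, 0))   # (1, H, W) -> (H, W, 1)
    img = np.repeat(img, 3, axis=2)      # (H, W, 1) -> (H, W, 3), same values in each channel
    # Assumption: the overlay expects values in [0, 1], so min-max rescale here.
    lo, hi = img.min(), img.max()
    if hi > lo:
        img = (img - lo) / (hi - lo)
    else:
        img = np.zeros_like(img)
    return img

# Small random stand-in for the 2048x2048 scan.
scan = np.random.default_rng(0).random((1, 8, 8)).astype(np.float32)
rgb = gray_chw_to_hwc3(scan)
bgr = rgb[..., ::-1]  # reversing the channel axis is the RGB <-> BGR swap

print(rgb.shape)                 # (8, 8, 3)
print(np.array_equal(rgb, bgr))  # True: all three channels hold the same values
```

Since every channel carries the same values, swapping the first and last channel changes nothing, which matches the observation that the RGB and BGR conversions give the same result.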

The show_cam_on_image function takes the following shapes:

Image shape: (2048, 2048, 3)
Mask shape: (2048, 2048)
Heatmap shape: (2048, 2048, 3)

jacobgil commented 6 months ago

Hi, use_rgb should be True if the input image is in RGB, and False if it is in BGR. In your case the image is both RGB and BGR (because the grayscale channel is duplicated).

show_cam_on_image just needs to know if the colors it's going to draw should be in RGB or BGR.

In your case both are fine, and I would just use use_rgb=False, and just be aware that the result is in RGB format when drawing the image or when saving to disk.
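A minimal NumPy sketch of the idea, under stated assumptions: blend_heatmap is a hypothetical stand-in, not the library's show_cam_on_image, and it assumes the heatmap starts out in BGR order (as OpenCV colormaps produce) and is flipped to RGB only when use_rgb=True. With a duplicated-grayscale image, the two settings then differ only in the channel order of the overlay colors.

```python
import numpy as np

def blend_heatmap(img, heatmap_bgr, use_rgb=False, image_weight=0.5):
    """Hypothetical stand-in for the overlay step (illustration only).

    img:         (H, W, 3) float32 in [0, 1]
    heatmap_bgr: (H, W, 3) float32 in [0, 1], channels assumed to be in BGR order
    """
    # Assumption: the flag only decides the channel order of the heatmap colors.
    heatmap = heatmap_bgr[..., ::-1] if use_rgb else heatmap_bgr
    cam = (1 - image_weight) * heatmap + image_weight * img
    cam = cam / np.max(cam)               # rescale so the brightest value is 1
    return (255 * cam).astype(np.uint8)

rng = np.random.default_rng(0)
gray = rng.random((4, 4, 1)).astype(np.float32)
img = np.repeat(gray, 3, axis=2)          # duplicated grayscale: RGB and BGR are identical
heatmap_bgr = rng.random((4, 4, 3)).astype(np.float32)

out_false = blend_heatmap(img, heatmap_bgr, use_rgb=False)
out_true = blend_heatmap(img, heatmap_bgr, use_rgb=True)

# The two outputs are channel-reversed copies of each other.
print(np.array_equal(out_true, out_false[..., ::-1]))  # True
```

If the real function behaves like this sketch, the choice of use_rgb only needs to match whatever consumes the result: cv2.imwrite and cv2.imshow expect BGR, while matplotlib's imshow and PIL expect RGB. It is worth checking the channel order of the output against the tool used to display or save it, since a mismatch shows up as red and blue being swapped in the heatmap.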

DariiaKhoroshchuk commented 6 months ago

Hi, Thank you for your answer; I appreciate it. Can you please explain in more detail what you mean by: "Just be aware that the result is in RGB format when drawing the image or when saving to disk"? Should I save it in some specific way, or are the colors just switched?

jacobgil / pytorch-grad-cam

Ambiguity in GradCam Visualization with Grayscale Images #469