Harry-Rogers opened this issue 2 years ago
Hi,
Need more details.
What is the target you're using?
Is it exactly like FasterRCNNBoxScoreTarget from the notebook, or something else?
I'm using the YOLOv5 model, so I'm just using the code below from the tutorial. I managed to get ScoreCAM working for a Faster RCNN with the same dataset, so I don't think it's that.
target_layers = [model.model.model.model[-2]]
Thanks, sorry for the delay in the response. Are you using FasterRCNNBoxScoreTarget as the target (not target_layers)? I suspect there is a problem there, so that's why I'm asking. In case you modified the target (the function that outputs a score), can you please paste the code here?
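For reference, this is the kind of target I mean, from the Faster R-CNN notebook (a rough sketch; labels/boxes are placeholders taken from an initial prediction, and argument names can differ slightly between versions):

from pytorch_grad_cam import ScoreCAM
from pytorch_grad_cam.utils.model_targets import FasterRCNNBoxScoreTarget

# target_layers: which layer's activations the CAM is computed over.
target_layers = [model.backbone]  # model-specific placeholder

# targets: the scoring function that turns a model output into a single score,
# here rewarding predicted boxes that overlap the originally detected boxes/labels.
targets = [FasterRCNNBoxScoreTarget(labels=labels, bounding_boxes=boxes)]

cam = ScoreCAM(model, target_layers, use_cuda=False)
grayscale_cam = cam(input_tensor, targets=targets)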
Hi, I have been using a YOLOv5s model; I adapted the YOLOv5 notebook code as shown below. I still get the same error mentioned above.
import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter('ignore')
import torch
import cv2
import numpy as np
import requests
import torchvision.transforms as transforms
from pytorch_grad_cam import ScoreCAM
from pytorch_grad_cam.utils.image import show_cam_on_image, scale_cam_image
from PIL import Image
COLORS = np.random.uniform(0, 255, size=(80, 3))
def parse_detections(results):
    detections = results.pandas().xyxy[0]
    detections = detections.to_dict()
    boxes, colors, names = [], [], []

    for i in range(len(detections["xmin"])):
        confidence = detections["confidence"][i]
        if confidence < 0.2:
            continue
        xmin = int(detections["xmin"][i])
        ymin = int(detections["ymin"][i])
        xmax = int(detections["xmax"][i])
        ymax = int(detections["ymax"][i])
        name = detections["name"][i]
        category = int(detections["class"][i])
        color = COLORS[category]

        boxes.append((xmin, ymin, xmax, ymax))
        colors.append(color)
        names.append(name)
    return boxes, colors, names
def draw_detections(boxes, colors, names, img):
    for box, color, name in zip(boxes, colors, names):
        xmin, ymin, xmax, ymax = box

        cv2.rectangle(
            img,
            (xmin, ymin),
            (xmax, ymax),
            color,
            2)

        cv2.putText(img, name, (xmin, ymin - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2,
                    lineType=cv2.LINE_AA)
    return img
image_url = "https://upload.wikimedia.org/wikipedia/commons/f/f1/Puppies_%284984818141%29.jpg"
img = np.array(Image.open("Puppies_(4984818141).jpg"))
img = cv2.resize(img, (640, 640))
rgb_img = img.copy()
img = np.float32(img) / 255
transform = transforms.ToTensor()
tensor = transform(img).unsqueeze(0)
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
model.eval()
model.cpu()
target_layers = [model.model.model.model[-2]]
results = model([rgb_img])
boxes, colors, names = parse_detections(results)
detections = draw_detections(boxes, colors, names, rgb_img.copy())
Image.fromarray(detections)
cam = ScoreCAM(model, target_layers, use_cuda=False)
grayscale_cam = cam(tensor)[0, :, :]
cam_image = show_cam_on_image(img, grayscale_cam, use_rgb=True)
Image.fromarray(cam_image)
Oh ok, now I got it.
The example in the YOLO notebook uses EigenCAM, a method that doesn't require a "target". The target is what guides the model in selecting which channels are important, by assigning them a score. In the FasterRCNN notebook there is a target function, used for AblationCAM, that checks how the predicted boxes in the modified image overlap (in IoU and category) with the original boxes. EigenCAM doesn't need this, but the ScoreCAM method does.
So FasterRCNNBoxScoreTarget will need to be rewritten for YOLO (since the model outputs the boxes in a different format).
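Something along these lines could be a starting point. A rough sketch only, not a tested implementation: YoloBoxScoreTarget is a made-up name, and it assumes the output for one image is YOLOv5's raw prediction tensor of shape [num_boxes, 5 + num_classes] laid out as (x, y, w, h, objectness, class scores):

import torch
import torchvision

class YoloBoxScoreTarget:
    def __init__(self, boxes, labels, iou_threshold=0.5):
        self.boxes = boxes                  # reference boxes as xyxy tensors of shape [4]
        self.labels = labels                # reference class indices
        self.iou_threshold = iou_threshold

    def __call__(self, output):
        # Assumed layout: raw predictions [num_boxes, 5 + num_classes].
        preds = output[0] if isinstance(output, (tuple, list)) else output
        preds = preds.reshape(-1, preds.shape[-1])
        # Convert center-format xywh to xyxy.
        pred_boxes = torch.cat([preds[:, :2] - preds[:, 2:4] / 2,
                                preds[:, :2] + preds[:, 2:4] / 2], dim=1)
        # Per-class confidence = objectness * class probability.
        pred_scores = preds[:, 4:5] * preds[:, 5:]

        score = torch.tensor(0.0, device=preds.device)
        for box, label in zip(self.boxes, self.labels):
            ious = torchvision.ops.box_iou(box[None, :].to(preds.device), pred_boxes)[0]
            mask = ious > self.iou_threshold
            if mask.any():
                # Reward overlap with the original box, weighted by that class's confidence.
                score = score + (ious[mask] * pred_scores[mask, label]).max()
        return score

The exact head output layout (and whether the boxes are already decoded to pixel coordinates) differs between YOLOv5 versions, so the decoding above would need to be checked against the model.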
I can try doing that
Oh ok thank you for clearing that up.
If that's possible that would be great.
First, I want to thank jacobgil for the brilliant work, especially the tutorials; they are very helpful, even more useful than the tutorials provided by captum.ai.
Anyway, I've been trying to get pytorch-grad-cam to output CAM images for specific labels, and I wrote a ScoreTarget class for YOLO. I'm trying to get AblationCAM working for YOLOv5, but after some tinkering, things got stuck.
My understanding is that AblationCAM replaces the target layer I provided (like target_layers = [model.model.model.model[-2]]) with the ablation layer. But after this replacement, YOLOv5 reported this error:
AttributeError: 'AblationLayerYolo' object has no attribute 'f'
So my question is: do I need to implement this f attribute myself? From what I saw, an ablation layer should have .set_next_batch and __call__, and this f attribute seems to be something native to YOLOv5, but since the layer replacement occurs, I also need to address it.
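One idea I'm considering (just a sketch, and it assumes the AttributeError comes from YOLOv5's forward pass reading the bookkeeping attributes .f and .i on every module it owns) is to copy those attributes from the layer being replaced onto the ablation layer before the swap:

# ablation_layer is my AblationLayerYolo instance; the names below are placeholders.
original_layer = model.model.model.model[-2]
for attr in ("f", "i"):  # YOLOv5's "from" index and layer index
    if hasattr(original_layer, attr):
        setattr(ablation_layer, attr, getattr(original_layer, attr))

But I'm not sure whether that is the intended way to handle it.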
By the way, maybe ScoreCAM can be adapted for YOLOv5 more easily? From what I see, there is no layer replacement there.
I can try doing that
Were you able to implement that?
Similar error here!
Hi, I have pretrained a YOLOv5 model on a custom dataset and have tried to use the tutorial code with ScoreCAM, but I get the error below.
ValueError: only one element tensors can be converted to Python scalars
This points to line 59 of score_cam.py (below):
outputs = [target(o).cpu().item() for o in self.model(batch)]
I'm unsure how to fix this, as the batch is a tensor with the same shape as in my other implementation that uses ScoreCAM with a Faster RCNN network.
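For what it's worth, the ValueError itself is just the standard .item() restriction, which fires whenever the target returns more than one value for a model output. A minimal illustration, not involving the model at all:

import torch

score = torch.tensor([0.3, 0.7])  # a multi-element score vector instead of a single scalar
score.item()                      # ValueError: only one element tensors can be converted to Python scalars

So presumably the target/output combination is producing a vector rather than one score per image.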
Any help would be greatly appreciated.