hasanirtiza / Pedestron

[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021
https://openaccess.thecvf.com/content/CVPR2021/papers/Hasan_Generalizable_Pedestrian_Detection_The_Elephant_in_the_Room_CVPR_2021_paper.pdf
Apache License 2.0

Test-time Loss for (Demo) Images #122

Closed by cpauling 2 years ago

cpauling commented 2 years ago

Hi,

I'm working on a project involving adversarial attacks on object detectors, and I am using the models you have provided as examples. To implement the methods I am using, I need the loss on the predicted bboxes (and potentially other loss terms). Could you explain how I could output a pre-trained model's loss for a given image at test time (such as one of the demo images)?

I have had a look at the source code and found some functions that seem to take the predicted and ground-truth bboxes as input, but I cannot figure out how to feed the results from running the demo into these functions in order to calculate the loss. Also, do you have the ground-truth annotations for the demo images?

Thank you.

hasanirtiza commented 2 years ago

For starters, you need to find the corresponding ground-truth files for the demo images (they are taken from Caltech, CityPersons and ECP), which should be fun. After that, look at the forward pass function; you can compute the loss there. Hope that helps.
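As a rough illustration of that second step, the sketch below computes a box-regression loss from test-time detections once ground-truth boxes are available. It is not Pedestron's own loss code: the helper names (`pairwise_iou`, `matched_box_loss`), the best-IoU matching rule, and the 0.5 threshold are assumptions made for illustration, and boxes are assumed to be in (x1, y1, x2, y2) pixel format.

```python
import torch
import torch.nn.functional as F


def pairwise_iou(a, b):
    """IoU between every box in a (N, 4) and every box in b (M, 4)."""
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    top_left = torch.max(a[:, None, :2], b[None, :, :2])
    bottom_right = torch.min(a[:, None, 2:], b[None, :, 2:])
    wh = (bottom_right - top_left).clamp(min=0)  # zero size where boxes do not overlap
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter)


def matched_box_loss(pred_boxes, gt_boxes, iou_thr=0.5):
    """Smooth L1 loss between each prediction and its best-overlapping ground-truth box.

    Predictions whose best IoU is below iou_thr are ignored (hypothetical matching rule).
    """
    iou = pairwise_iou(pred_boxes, gt_boxes)
    best_iou, best_gt = iou.max(dim=1)
    keep = best_iou >= iou_thr
    if keep.sum() == 0:
        return pred_boxes.sum() * 0.0  # zero loss that stays attached to the graph
    return F.smooth_l1_loss(pred_boxes[keep], gt_boxes[best_gt[keep]])


# Made-up detections and annotations for one image.
pred = torch.tensor([[10., 10., 50., 100.], [200., 40., 240., 130.], [0., 0., 5., 5.]])
gt = torch.tensor([[12., 8., 52., 98.], [198., 42., 238., 128.]])
loss = matched_box_loss(pred, gt)
```

Here the third prediction overlaps no ground-truth box and is dropped, so the loss is averaged over the two matched boxes only. A real detector's training loss usually also includes classification terms, which this sketch leaves out.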

cpauling commented 2 years ago

Thank you for your reply. I have managed to find and download the ground-truth datasets. Since working on this, I have realised that I actually need to compute the gradient w.r.t. the input image, similar to the code shown here:

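import torch.nn.functional as F

data.requires_grad = True  # track gradients on the input tensor
output = model(data)
loss = F.smooth_l1_loss(output, target)
model.zero_grad()
loss.backward()  # fills data.grad if data is a leaf tensor
data_grad = data.grad.detach()  # gradient of the loss w.r.t. data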

Here, data would be the input image. I see how I could possibly get the loss using the approach you suggested. However, is it possible to add the data/image to the PyTorch graph so that its gradient can be computed? Currently the gradient comes back as a NoneType object.

In short, I need to calculate a tensor of gradients with the same dimensions as the input image. Would this be possible?
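One way to approach this, sketched under assumptions: in PyTorch, `.grad` is only populated for a leaf tensor that has `requires_grad=True` and that took part in a forward pass run with gradient tracking enabled. The gradient typically comes back as `None` when the forward pass runs inside `torch.no_grad()` (as inference helpers often do), or when the tensor that received `requires_grad` is not the one actually fed to the network. The tiny `ToyDetector` and the target box below are hypothetical stand-ins, not Pedestron's model or API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyDetector(nn.Module):
    """Hypothetical stand-in for a detector: maps an image to one (x1, y1, x2, y2) box."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.head = nn.Linear(8, 4)

    def forward(self, x):
        feats = F.relu(self.conv(x)).mean(dim=(2, 3))  # global average pool
        return self.head(feats)


torch.manual_seed(0)
model = ToyDetector().eval()

image = torch.rand(1, 3, 64, 64)               # stands in for the preprocessed image
target = torch.tensor([[10., 12., 40., 60.]])  # made-up ground-truth box

image.requires_grad_(True)   # image is a leaf tensor, so image.grad will be filled in
output = model(image)        # forward pass with gradient tracking enabled
loss = F.smooth_l1_loss(output, target)
model.zero_grad()
loss.backward()

image_grad = image.grad.detach()  # same shape as the input image

# For contrast: under no_grad no graph is built, so nothing can be backpropagated.
with torch.no_grad():
    no_graph_output = model(image)
```

With a real pipeline, the same idea would mean setting `requires_grad` on the image tensor after preprocessing and calling the model's loss-producing forward pass outside any `torch.no_grad()` block.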