Closed FrancescoSaverioZuppichini closed 4 years ago
Hi,
I believe you haven't trained for enough iterations to see the model converge, especially because you are not using a pre-trained model but are instead training from scratch, which requires a lot of iterations.
I would recommend following the fine-tuning steps in the tutorial that you pointed out, as you'll probably see better and faster results on limited data.
I'm closing the issue, but let us know if you have further questions.
Hi @fmassa,
Thanks :)
I followed the tutorial correctly. The loss decreases during training and I see no problem at all. In this case, no output means no bbox. I will use the pre-trained weights and train the model for longer. In the meantime, could you be so kind as to have a look at the code I attached? Maybe I missed something. One last question: should I resize the images to the usual ImageNet size (224)?
Thank you
The code seems correct to me.
You don't need to resize the image to 224. Just make sure your images are in the 0-1 range in RGB, and the model will rescale them internally for you.
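To make the internal rescaling mentioned above more concrete, here is a minimal sketch of how the scale factor could be computed. It assumes the default `min_size=800` and `max_size=1333` of torchvision's Faster R-CNN transform; `resize_scale` is a hypothetical helper written for illustration, not a torchvision function.

```python
def resize_scale(height, width, min_size=800, max_size=1333):
    """Scale factor applied to an image before it enters the backbone.

    Assumption: the shorter side is scaled to `min_size` unless that would
    push the longer side past `max_size`, in which case the longer side is
    scaled to `max_size` instead.
    """
    short_side = min(height, width)
    long_side = max(height, width)
    scale = min_size / short_side
    if long_side * scale > max_size:
        scale = max_size / long_side
    return scale


# A 480x640 image is scaled up so that its shorter side becomes 800.
print(resize_scale(480, 640))
# A 1000x2000 image is limited by the longer side instead.
print(resize_scale(1000, 2000))
```

Under these assumptions the input images can keep their original resolution, since the scaling depends only on the aspect ratio and the two size limits.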
The model is internally resizing the image and the bboxes to (480, 640, C), (COCO format), isn't it?
Using a pretrained network as follows:
from torchvision.models.detection.faster_rcnn import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
model = fasterrcnn_resnet50_fpn(True).to(device)
num_classes = 1
in_features = model.roi_heads.box_predictor.cls_score.in_features
# replace the pre-trained head with a new one
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes).to(device)
And training the network as shown in the first post, I get the following output:
tensor(0.0620, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.1114, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0959, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0404, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0653, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0422, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0317, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0355, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0278, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0377, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0372, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0334, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0235, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0251, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0247, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0220, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0195, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0216, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0260, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0247, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0163, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0161, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0149, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0171, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0158, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0155, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0122, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0179, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0129, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0119, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0133, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0140, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0145, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0131, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0117, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0094, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0123, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0126, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0086, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0106, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0117, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0069, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0099, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0119, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0069, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0109, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0124, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0075, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0088, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0132, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0069, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0101, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0099, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0097, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0087, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0101, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0054, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0092, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0095, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0055, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0078, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0098, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0041, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0080, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0118, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0048, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0089, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0085, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0043, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0074, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0105, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0036, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0075, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0080, device='cuda:0', grad_fn=<AddBackward0>)
tensor(0.0049, device='cuda:0', grad_fn=<AddBackward0>)
but still,
model = model.eval()
with torch.no_grad():
    model = model.cuda()
    pred = model([ds[2][0].cuda()])
pred is still empty
[{'boxes': tensor([], size=(0, 4)),
'labels': tensor([], dtype=torch.int64),
'scores': tensor([])}]
Any idea?
On my side, I have rechecked the type of the inputs and they are correct. An example of one item in the dataset is:
(tensor([[[0.0549, 0.0549, 0.0549, ..., 0.1647, 0.1569, 0.1569],
[0.0549, 0.0549, 0.0549, ..., 0.1686, 0.1569, 0.1569],
[0.0549, 0.0549, 0.0549, ..., 0.1647, 0.1569, 0.1529],
...,
[0.0471, 0.0471, 0.0471, ..., 0.1490, 0.1490, 0.1490],
[0.0471, 0.0471, 0.0471, ..., 0.1490, 0.1490, 0.1490],
[0.0471, 0.0471, 0.0471, ..., 0.1490, 0.1490, 0.1490]],
[[0.0471, 0.0471, 0.0471, ..., 0.1255, 0.1176, 0.1176],
[0.0471, 0.0471, 0.0471, ..., 0.1294, 0.1176, 0.1176],
[0.0471, 0.0471, 0.0471, ..., 0.1255, 0.1176, 0.1137],
...,
[0.0235, 0.0235, 0.0235, ..., 0.1098, 0.1098, 0.1098],
[0.0235, 0.0235, 0.0235, ..., 0.1098, 0.1098, 0.1098],
[0.0235, 0.0235, 0.0235, ..., 0.1098, 0.1098, 0.1098]],
[[0.0510, 0.0510, 0.0510, ..., 0.1176, 0.1098, 0.1098],
[0.0510, 0.0510, 0.0510, ..., 0.1216, 0.1098, 0.1098],
[0.0510, 0.0510, 0.0510, ..., 0.1176, 0.1098, 0.1059],
...,
[0.0314, 0.0314, 0.0314, ..., 0.1059, 0.1059, 0.1059],
[0.0314, 0.0314, 0.0314, ..., 0.1059, 0.1059, 0.1059],
[0.0314, 0.0314, 0.0314, ..., 0.1059, 0.1059, 0.1059]]]),
{'boxes': tensor([[315.0003, 213.5002, 626.0004, 329.5002]]),
'labels': tensor([0]),
'image_id': tensor([1]),
'area': tensor([36503.9961]),
'iscrowd': tensor([0])})
I am not sure about iscrowd, but in the tutorial, it was set to zero.
Thanks.
@FrancescoSaverioZuppichini I think I see the issue: the label for your object is 0, but Faster R-CNN considers value 0 as background. If you make the label be 1, it should work fine.
This is illustrated in the detection tutorial you mentioned, see the dataset line:
# there is only one class
labels = torch.ones((num_objs,), dtype=torch.int64)
But I agree it can be a bit tricky to spot this. I would happily accept a PR improving the documentation mentioning that the labels should start at 1 and that 0 is treated as background.
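As a minimal sketch of the convention described above, assuming the dataset stores 0-based class indices: `to_detection_labels` is a hypothetical helper that shifts them so that 0 stays reserved for background.

```python
def to_detection_labels(class_indices):
    """Shift 0-based class indices to 1-based detection labels.

    Label 0 is left unused so the model can treat it as background.
    """
    if any(index < 0 for index in class_indices):
        raise ValueError("class indices must be non-negative")
    return [index + 1 for index in class_indices]


# One object of the single class: index 0 becomes label 1.
labels = to_detection_labels([0])
print(labels)  # [1]
```

In a dataset's `__getitem__`, the result would then be wrapped as `torch.tensor(labels, dtype=torch.int64)` before being placed in the target dictionary.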
@fmassa Thank you, it works! 🥳🥳
I will definitely create a PR and improve the doc over the weekend
Cool, looking forward to the PR improving the documentation!
Hi @fmassa, I hope you are healthy. Sorry for the late reply but I have been very busy these days. Is there a doc contribution guide that I can follow to be sure I am changing the right files?
Hi @FrancescoSaverioZuppichini
All good here, hope everything is good for you as well.
You could maybe add some information in https://github.com/pytorch/vision/blob/master/docs/source/models.rst#object-detection-instance-segmentation-and-person-keypoint-detection or in the tutorials, which are hosted in https://github.com/pytorch/tutorials/blob/master/intermediate_source/torchvision_tutorial.rst
Hi @fmassa, I hope you are doing well. I have added a couple of sentences and hopefully, it is more understandable now
You can find the PR here https://github.com/pytorch/tutorials/pull/914
Thanks for the PR @FrancescoSaverioZuppichini !
Hi @FrancescoSaverioZuppichini @fmassa. I am also getting no predictions from the Faster R-CNN model. How did you resolve the problem? From reading the messages above, was it just a matter of starting the label index at 1 instead of 0?
There is still an error in the documentation. If you have 3 classes in your dataset, and you have no background class in your dataset, you have to specify that num_classes=4 instead of num_classes=3. So, your labels would only contain 1, 2, and 3. However, you need to indicate that there is a non-existent class 0 by specifying the number of classes is equal to four.
If you don't, you will trigger an error: RuntimeError: CUDA error: device-side assert triggered
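The counting rule above can be sketched as follows. Both helpers are hypothetical names used for illustration; the only assumption is the one stated in the thread, namely that label 0 is background and object labels run from 1 up to the number of object classes.

```python
def detection_num_classes(num_object_classes):
    """Value to pass as `num_classes`: object classes plus background."""
    return num_object_classes + 1


def labels_in_range(labels, num_classes):
    """True if every label is a valid object label (1 .. num_classes - 1)."""
    return all(1 <= label < num_classes for label in labels)


num_classes = detection_num_classes(3)
print(num_classes)                              # 4
print(labels_in_range([1, 2, 3], num_classes))  # True
# Label 3 does not fit into a 3-way head, which is the situation that
# leads to the device-side assert mentioned above.
print(labels_in_range([1, 2, 3], 3))            # False
```

Running a check like this on the CPU before training can surface a bad label as a readable message instead of a CUDA assert.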
Hi, can someone help me with this too? I am trying object detection using Faster R-CNN, and I used a pre-trained model to fine-tune it on my custom dataset. I've labeled everything correctly, e.g. [1, 2, 3] for 3 classes plus background, which is 0. Whenever I log the summed losses, they are always above 1e+25, and even when I use model.eval() to get detections on the test set, I get no output other than this:
[{'boxes': tensor([], device='cuda:0', size=(0, 4)), 'labels': tensor([], device='cuda:0', dtype=torch.int64), 'scores': tensor([], device='cuda:0')}]
Hi @jaaabir, your learning_rate parameter is probably too high. I solved that problem by decreasing the lr in my optimizer.
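A minimal sketch of a guard for this situation, assuming the summed loss has already been converted to a Python float (for example with `.item()`). The helper name and the threshold are hypothetical and would need tuning for a real training run.

```python
import math


def loss_has_diverged(loss_value, threshold=1e6):
    """True if the loss is NaN, infinite, or implausibly large.

    `threshold` is an arbitrary illustrative cut-off, not a recommended value.
    """
    return not math.isfinite(loss_value) or abs(loss_value) > threshold


print(loss_has_diverged(0.062))         # False
print(loss_has_diverged(1e25))          # True
print(loss_has_diverged(float("nan")))  # True
```

If the guard fires within the first iterations, lowering the learning rate passed to the optimizer is a reasonable first thing to try.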
🐛 Bug
Dear all,
I am doing object detection in an image with one class. After training, FastRCNNPredictor does not return anything in validation mode. I have followed this official tutorial: https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
Thanks.
To Reproduce
Steps to reproduce the behavior:
I have created a custom dataset; this is one of its outputs:
To prove its correctness I have also visualized the bbox on the image:
Then I create a Dataloader:
Training works:
Output:
But, when I try to get a prediction I have no output:
pred is
Thank you in advance
Expected behavior
The model should return a valid prediction.
Environment