theoutsider8060 opened this issue 4 years ago
Hi @theoutsider8060, thank you for your interest in DETR.
I suggest you have a look at issues #9, #124, and #125 to see if you can gain some insights.
Regarding your specific case, some remarks/questions:
Best of luck.
Thanks for the reply, @alcinos. I will definitely look more into the issues you mentioned.
With regards to results using other models: at the moment, I am getting an AP of 40.2 using Faster R-CNN, with an inference time of 91 ms. With DETR, my inference time per image has dropped to around 70 ms, but the AP has also dropped, as I mentioned.
Just a few queries before I make the change:
@theoutsider8060 Hello, I am now facing the same situation. Since I am using the detectron2 wrapper, how did you change class_nums to 1? I did not see it in the yaml files.
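If it helps, the detectron2 wrapper keeps its DETR-specific options under a dedicated config node rather than in the stock detectron2 yaml keys. Assuming the wrapper exposes a MODEL.DETR.NUM_CLASSES option (check the wrapper's config code, e.g. where add_detr_config is defined, for the exact key name), a single-class override might look like:

```yaml
MODEL:
  DETR:
    NUM_CLASSES: 1   # assumed key name; verify against the wrapper's config defaults
```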
@alcinos I have a question about image transformations. In my case, all the images have the same shape (1080, 1440); do I still need to resize them? And how should I change the input part of the yaml files, e.g. MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)? Does this mean images are reshaped into a square shape rather than a rectangle?
Thanks in advance!
Hi @Antoine-ls, the second part of your question has been answered by @fmassa in https://github.com/facebookresearch/detr/issues/245#issuecomment-699491315
You should resize the input to have a minimum size of 800 (and ideally a max size of 1333), the model doesn't do the resizing inside it. Check our colab notebook for further details on how to perform the input preprocessing https://colab.research.google.com/github/facebookresearch/detr/blob/colab/notebooks/detr_attention.ipynb
@Antoine-ls
class_nums
The image will still be rectangular, as MIN_SIZE_TRAIN is used along with MAX_SIZE_TRAIN (defaults to 1333). You can see how these params are used in the function:
```python
class ResizeShortestEdge(Augmentation):
    """
    Scale the shorter edge to the given size, with a limit of `max_size` on the longer edge.
    If `max_size` is reached, then downscale so that the longer edge does not exceed max_size.
    """
```
Hi @theoutsider8060, any luck with your DETR training? I am in a similar situation to yours. Did you manage to get performance comparable to Mask R-CNN?
Regards, Ruoding
Hi, thank you for this great repo on DETR. I was trying to fine-tune DETR on my custom dataset using the detectron2 wrapper and I need some suggestions regarding that.
My dataset consists of around 14k training images and 5k validation images with a maximum of 30 objects per image. There is only a single class.
These are the changes I made to the default configuration:
```
WARNING [07/31 15:27:46 fvcore.common.checkpoint]: Skip loading parameter 'detr.class_embed.weight' to the model due to incompatible shapes: (81, 256) in the checkpoint but (2, 256) in the model! You might want to double check if this is expected.
WARNING [07/31 15:27:46 fvcore.common.checkpoint]: Skip loading parameter 'detr.class_embed.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (2,) in the model! You might want to double check if this is expected.
[07/31 15:27:46 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint: detr.class_embed.{weight, bias} criterion.empty_weight
```
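For what it's worth, these warnings are expected when going from 81 COCO classes to 2: the detectron2 checkpointer simply skips the mismatched classification head. If you ever fine-tune through the plain DETR training script instead, one common workaround is to drop those keys from the checkpoint yourself before loading the remaining weights with strict=False. A sketch, assuming a state_dict-style mapping:

```python
# drop the classification head so a model with a different num_classes
# can load the remaining pretrained weights
def strip_class_head(state_dict):
    return {k: v for k, v in state_dict.items() if "class_embed" not in k}

# toy stand-in for a real checkpoint's state_dict
ckpt = {
    "detr.class_embed.weight": "81x256 tensor",
    "detr.class_embed.bias": "81-dim tensor",
    "detr.backbone.conv1.weight": "conv tensor",
}
pruned = strip_class_head(ckpt)
```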
Now, it seems my mAP is stuck around 35-38 and not improving much, and the training curve has plateaued as well. What can I do to make my model perform better? These are some questions I have off the top of my head:
I tried changing num_queries to 60 and training for 50 epochs, but my mAP dropped to only 6. Should I pursue this further? I do not have much computational resource to train for 300+ epochs.
Should I change the eos_coef? Will that help?
I am currently using the default augmentation. Should I add some extra augmentation so the model generalizes better on my dataset?
Is there something specific I need to change because I have only a single class? The loss values are still pretty high, around 20-30, after 70 epochs.
Any help would be greatly appreciated. Thanks in advance.