facebookresearch / detr

End-to-End Object Detection with Transformers
Apache License 2.0

training fails on custom dataset #179

Closed aliamiri1380 closed 3 years ago

aliamiri1380 commented 3 years ago

Hi, I copied the DETR code and ran it. After 50 epochs the loss is about 2 on a dataset of about 150 samples, with all images at 800x800. As I said, I copied everything from this repo, but the interesting thing is that the output of the model is always constant: if I feed in different images, it outputs the same numbers and nothing changes. I was expecting it to at least overfit the training dataset, not this behavior.

https://colab.research.google.com/drive/1jR8rVC3ILKC5g6OucuMzReUmfROlVy4P?usp=sharing This is my Colab notebook; the outputs are printed at the bottom.

lessw2020 commented 3 years ago

Hi, there is no access to your Colab, FYI. That said, something is clearly wrong in your setup if you are getting the same outputs regardless of the input image. I've trained with as few as 100 images and had excellent results. If you make it publicly accessible, please post again; otherwise it's not possible to help isolate the error.

aliamiri1380 commented 3 years ago

Sorry about that, here is the public link: https://colab.research.google.com/drive/1jR8rVC3ILKC5g6OucuMzReUmfROlVy4P?usp=sharing

UnibsMatt commented 3 years ago

Hi, I'm facing the same issue. I reproduced the architecture using the same transformer. In the first epochs the outputs of the transformer differ, but after a few epochs they all become the same. According to other issues, the training data must be at least 10k-12k images to obtain good performance. It seems impossible to make the architecture overfit a small training dataset. I also tried training with PyTorch 1.4.0 without any error, but it seems that all the queries of the transformer are "looking" at the same part of the image (the center). Does training from scratch with a small dataset (100-300 images) produce good results?

aliamiri1380 commented 3 years ago

Yes, my code predicts approximately the center of every image each time I run it, but it's still weird.

alcinos commented 3 years ago

Hi @aliamiri1380

Your notebook includes a great deal of custom code and I don't have the time to review it in depth. Please start from a setup that is as close as possible to our training procedure, and experiment from there. As an example, you seem to be using Adam (self.optimizer = Adam(self.parameters())) without setting a learning rate at all, which is not going to work. Use the recommended specific learning rates for the transformer and the backbone. Also, you seem to be using torchvision's resize transform directly (tv.transforms.Resize(size[0])). This is not going to work, since you need to update the bounding boxes as well, not just the image. Please use the same transforms as we do as a starting point.
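A minimal sketch of the two points above, not the repository's own code. The helper names split_param_groups and resize_boxes are hypothetical, the 1e-5 backbone learning rate is assumed to mirror the repository's default (check main.py), and boxes are assumed to be absolute (x0, y0, x1, y1) pixel coordinates.

```python
def split_param_groups(named_params, lr_backbone=1e-5):
    """Split (name, parameter) pairs into two optimizer groups.

    Parameters whose name contains "backbone" get their own, smaller
    learning rate; everything else uses the optimizer's default lr.
    The 1e-5 value is an assumption, not read from the notebook.
    """
    named_params = [
        (name, p) for name, p in named_params
        if getattr(p, "requires_grad", True)  # skip frozen parameters
    ]
    backbone = [p for name, p in named_params if "backbone" in name]
    rest = [p for name, p in named_params if "backbone" not in name]
    return [{"params": rest}, {"params": backbone, "lr": lr_backbone}]


def resize_boxes(boxes, orig_size, new_size):
    """Rescale (x0, y0, x1, y1) pixel boxes to match a resized image.

    orig_size and new_size are (width, height) tuples. The image itself
    must be resized separately with the same new_size.
    """
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    scale_x = new_w / orig_w
    scale_y = new_h / orig_h
    return [
        (x0 * scale_x, y0 * scale_y, x1 * scale_x, y1 * scale_y)
        for x0, y0, x1, y1 in boxes
    ]
```

With PyTorch, the groups would be passed to the optimizer along the lines of torch.optim.AdamW(split_param_groups(model.named_parameters()), lr=1e-4, weight_decay=1e-4), where the 1e-4 values are again assumed defaults. If the boxes are stored in normalized coordinates instead of pixels, a plain resize leaves them unchanged, so the rescaling step only applies to absolute coordinates.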

With 150 images, you will not be able to train from scratch. You need at least about 10K samples to hope for it to work. You need to fine-tune from a COCO pre-trained model. For fine-tuning, there are a few resources available: https://gist.github.com/mlk1337/651297e28199b4bb7907fc413c49f58f https://www.kaggle.com/tanulsingh077/end-to-end-object-detection-with-transformers-detr

See also #9 and #125. I'm closing this since it's a duplicate; if you have trouble, check the issues above and post there if something is still unclear. Good luck.
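One common way to prepare a COCO checkpoint for fine-tuning on a dataset with a different number of classes is to drop the classification head before loading, so that only the head is learned from scratch. The sketch below assumes the head's weights are stored under keys starting with "class_embed." and that the weights sit under a "model" key in the checkpoint; strip_class_head is a hypothetical helper, not part of the repository.

```python
def strip_class_head(state_dict, prefix="class_embed."):
    """Return a copy of state_dict without the classification head.

    The "class_embed." prefix is an assumption about how the head is
    named in the checkpoint; inspect the keys of your own checkpoint
    before relying on it. The input dictionary is left untouched.
    """
    return {
        key: value
        for key, value in state_dict.items()
        if not key.startswith(prefix)
    }
```

With PyTorch, the filtered weights would then be loaded non-strictly, for example model.load_state_dict(strip_class_head(checkpoint["model"]), strict=False), so that the missing head keys are tolerated and the new head keeps its fresh initialization.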

fuyunfei commented 3 years ago

Yes, my code predicts approximately the center of every image each time I run it, but it's still weird.

Hi @aliamiri1380, the same weird thing happened to me too: it is always looking at the center. Did you figure out how to solve it?

UnibsMatt commented 3 years ago

As explained in other issues, DETR only works with 10k+ images. You can try to fine-tune their model, but training from scratch won't work on a small dataset.
