facebookresearch / detr

End-to-End Object Detection with Transformers
Apache License 2.0

Custom dataset training notebook ? #152

Open jeromen7 opened 4 years ago

jeromen7 commented 4 years ago

Hello,

I was wondering if someone has managed to write a notebook for training DETR on a custom dataset. I saw issue #9, but there are a lot of messages and nobody provided a complete solution for what I am looking for. The Kaggle solution seems to work well, but I don't know how to generalize it to multiple classes (and not only one, for the wheat), or how to drop the k-fold cross-validation, which is great but adds a lot of computation time. Moreover, the README in this repo says that the dataset should be in COCO format with JSON annotation files, but the Kaggle solution never uses JSON files... So I don't know the expected format of a dataset for training DETR on it: CSV file? JSON file? xyxy or xywh format, or both?

To sum up, I am looking for a simple and well-structured notebook that works on a dataset split into 2 folders (train and validation), for example:

  1. Imports and installations
  2. Loading the model and configuration (setting values like num_queries, num_classes according to the object detection problem we want to solve)
  3. Training the model (with the path to the dataset, and an example of a small dataset would be awesome to see the format)
  4. Inference and, optionally, metrics generation to evaluate the model

I know I am asking a lot, but any assistance would be much appreciated! Thank you all very much for your help 😄

lessw2020 commented 4 years ago

Hi @jeromen7 - I've been working on one, but work has delayed it. do you have a good example dataset in mind? That would help, as I can't use my work ones and haven't found a good one yet.
Was hoping to find a covid mask detector or similar dataset or something pertinent. (One was proposed before for step detection, but that dataset was not set up well: the maker had manually blended in flipped images, making it impossible to separate validation from training images.)

lessw2020 commented 4 years ago

Re: dataset format - I've been using COCO-format JSON with boxes in xyxy format (converted from COCO's xywh boxes). Some others have had luck with cxcy. The DETR codebase was built around COCO JSON, with a change to xyxy format for input - you can look at coco.py under /datasets to see how they did it. Note that my half-complete Colab is here, if that helps add some context for you: https://github.com/lessw2020/training-detr
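To make the box formats above concrete, here is a minimal sketch in plain Python (not the code from coco.py) of the two conversions being discussed: COCO's `[x, y, width, height]` to xyxy, and xyxy to a center format normalized by image size, which is one reading of the "cxcy" format mentioned above. The function names are made up for illustration, and whether your pipeline needs the second step depends on the transforms you use.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def coco_xywh_to_xyxy(box: Box) -> Box:
    """Convert a COCO box (top-left x, top-left y, width, height)
    to corner format (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def xyxy_to_normalized_cxcywh(box: Box, img_w: float, img_h: float) -> Box:
    """Convert a corner-format box to (center x, center y, width, height),
    each divided by the image size so values fall in [0, 1]."""
    x0, y0, x1, y1 = box
    cx = (x0 + x1) / 2.0 / img_w
    cy = (y0 + y1) / 2.0 / img_h
    w = (x1 - x0) / img_w
    h = (y1 - y0) / img_h
    return (cx, cy, w, h)


if __name__ == "__main__":
    # A 30x40 box whose top-left corner is at (10, 20), in a 100x200 image.
    corners = coco_xywh_to_xyxy((10, 20, 30, 40))
    print(corners)
    print(xyxy_to_normalized_cxcywh(corners, img_w=100, img_h=200))
```

If your annotations are already xyxy (as in some CSV exports), skip the first function; applying it twice would silently produce oversized boxes.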

woctezuma commented 4 years ago

> do you have a good example dataset in mind?
>
> Was hoping to find a covid mask detector or similar dataset or something pertinent.

What about one of these?

https://www.kaggle.com/andrewmvd/face-mask-detection

https://www.kaggle.com/mbkinaci/fruit-images-for-object-detection

https://www.kaggle.com/wobotintelligence/face-mask-detection-dataset

jeromen7 commented 4 years ago

Thank you for your answer @lessw2020 ! Now I know the appropriate format for my dataset, I will work on it. If you need a dataset for the notebook, maybe this one will help you ? https://drive.google.com/drive/folders/1XUR4ci88ABahff3TOxoT9GuxbjP7NwCq?usp=sharing

Thank you again for using some of your time to help us all ! 🙏

lessw2020 commented 4 years ago

Great, thanks @woctezuma and @jeromen7. I think the covid mask one Jeromen posted would be easiest to work with, as it is already in COCO format (the Kaggle ones are CSV). You can use either, but CSV needs an extra translation step. @jeromen7 - I'm unclear, though, whether 'no mask' in this dataset means a person's face with no mask. There are some images of a building etc., and the model can't deal with an unbounded 'no' class, so I just wanted to confirm what the 'no mask' class means. (Maybe it's obvious once I preview with bounding boxes, but I have only looked at the images.) Anyway, let me get my upcoming work demo ready by Fri, and then I'll get this nice covid mask dataset up and running in a notebook to show.
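For anyone starting from CSV, the "extra translation step" could look roughly like the sketch below. It assumes a hypothetical CSV layout (`filename,width,height,class,xmin,ymin,xmax,ymax`, one row per box); the Kaggle datasets linked above may use different columns, so adjust the field names accordingly. Category ids starting at 0 are also an assumption, so check what your training code expects.

```python
import csv
import io
import json

# Hypothetical sample data; real datasets will have their own column names.
SAMPLE_CSV = """filename,width,height,class,xmin,ymin,xmax,ymax
a.jpg,640,480,mask,10,20,110,220
a.jpg,640,480,no mask,300,50,400,150
b.jpg,800,600,mask,0,0,50,60
"""


def csv_to_coco(csv_text, class_names):
    """Build a COCO-style dict (images, annotations, categories)
    from CSV rows that store one xyxy box per row."""
    name_to_id = {name: i for i, name in enumerate(class_names)}
    categories = [{"id": i, "name": name} for name, i in name_to_id.items()]
    images, annotations, image_ids = [], [], {}

    for row in csv.DictReader(io.StringIO(csv_text)):
        fname = row["filename"]
        if fname not in image_ids:
            # First time we see this file: register it as a new image.
            image_ids[fname] = len(image_ids) + 1
            images.append({
                "id": image_ids[fname],
                "file_name": fname,
                "width": int(row["width"]),
                "height": int(row["height"]),
            })
        xmin, ymin, xmax, ymax = (
            float(row[key]) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        w, h = xmax - xmin, ymax - ymin
        annotations.append({
            "id": len(annotations) + 1,
            "image_id": image_ids[fname],
            "category_id": name_to_id[row["class"]],
            "bbox": [xmin, ymin, w, h],  # COCO stores x, y, width, height
            "area": w * h,
            "iscrowd": 0,
        })

    return {"images": images, "annotations": annotations,
            "categories": categories}


if __name__ == "__main__":
    coco = csv_to_coco(SAMPLE_CSV, ["mask", "no mask"])
    print(json.dumps(coco, indent=2))
```

Writing the result with `json.dump` once for the train folder and once for the validation folder gives two annotation files of the kind the README describes.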

jeromen7 commented 4 years ago

Yes, you are right @lessw2020, the 'no mask' class is for faces with no mask, and the 'mask' class is for faces with a mask.

bconsolvo commented 4 years ago

@lessw2020 Thank you for your work so far - I am looking forward to seeing the rest of the custom training dataset Colab notebook. I took the course on creating COCO datasets (https://www.udemy.com/course/creating-coco-datasets/), by Adam Kelly. Now that I have my own image data and annotations in the COCO format, I am looking to start training with DETR.

I haven't found anybody else who has posted an example of training with DETR on a custom dataset, yet.

woctezuma commented 4 years ago

> I haven't found anybody else who has posted an example of training with DETR on a custom dataset, yet.

You can finetune DETR either directly:

Or with the "detectron2" wrapper: