facebookresearch / detr

End-to-End Object Detection with Transformers
Apache License 2.0

codes of data processing and detr model when pretraining detr with num_classes=250? #262

Closed · Jacobew closed this 3 years ago

Jacobew commented 3 years ago

Following the instructions in the paper, we should pretrain a DETR model with 250 classes where things and stuff are treated equally. But the DETR-R50 detection model is trained with 91 classes, so it cannot be loaded and frozen directly in the DETRSegm model with 250 classes (unless we manually remove the weights of the classifier). Could you share the corresponding pretrained model trained with 250 classes, and the data-processing code (that converts stuff segments to bounding boxes) for reproduction? Many thanks! :)
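For what it's worth, the "manually remove the weights of the classifier" workaround mentioned above can be done by filtering the checkpoint's state dict before loading. A minimal sketch, using plain dicts in place of real tensors; `class_embed` is the name of DETR's classification head, but treat the exact key prefix as an assumption, and in practice you would apply this to the torch state dict and call `model.load_state_dict(filtered, strict=False)`:

```python
# Sketch: drop the classification-head weights from a 91-class checkpoint
# so the remaining weights can be loaded into a model built with
# num_classes=250. Plain dicts stand in for the torch state_dict here;
# "class_embed" (DETR's classifier name) is used as an assumed prefix.

def strip_classifier(state_dict):
    """Return a copy of the state dict without the classification head."""
    return {k: v for k, v in state_dict.items()
            if not k.startswith("class_embed")}

# Toy checkpoint with placeholder "weights":
checkpoint = {
    "backbone.conv1.weight": [0.1],
    "class_embed.weight": [0.2],   # shaped for 91 (+1 no-object) classes
    "class_embed.bias": [0.3],
    "bbox_embed.layers.0.weight": [0.4],
}
filtered = strip_classifier(checkpoint)
# In practice: model.load_state_dict(filtered, strict=False)
```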

Jacobew commented 3 years ago

@alcinos @fmassa

alcinos commented 3 years ago

Hi @Jacobew

Since in the panoptic model the detection part (backbone + encoder/decoder) is frozen (and only the mask head is trained), you can extract the frozen detection part from the panoptic models we provide (simply delete or reset the mask head).
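The extraction suggested here (keep the detection part, delete the mask head) could be sketched as a state-dict filter. `mask_head` and `bbox_attention` are the segmentation-wrapper module names in the DETR repo, but the exact key layout of the released checkpoints should be verified; plain dicts stand in for tensors:

```python
# Sketch: keep only the detection part (backbone + encoder/decoder) of a
# panoptic checkpoint by dropping the mask-head parameters.
# "mask_head." / "bbox_attention." are the segmentation-wrapper module
# names in the DETR repo; the exact prefixes are an assumption here.

MASK_PREFIXES = ("mask_head.", "bbox_attention.")

def extract_detection_part(state_dict):
    """Return a copy of the state dict without the mask-head modules."""
    return {k: v for k, v in state_dict.items()
            if not k.startswith(MASK_PREFIXES)}

# Toy panoptic checkpoint with placeholder "weights":
panoptic_ckpt = {
    "detr.backbone.conv1.weight": [0.1],
    "detr.class_embed.weight": [0.2],
    "mask_head.lay1.weight": [0.3],
    "bbox_attention.q_linear.weight": [0.4],
}
detection_only = extract_detection_part(panoptic_ckpt)
```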

As for data processing, everything is provided, see e.g.: https://github.com/facebookresearch/detr/blob/4e1a9281bc5621dcd65f3438631de25e255c4269/datasets/coco_panoptic.py#L34
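The linked `coco_panoptic.py` derives target boxes from the panoptic masks (via `masks_to_boxes` in the repo's `util/box_ops.py`). For readers skimming the thread, here is a simplified NumPy re-implementation of that mask-to-box step, not the repo's actual code:

```python
import numpy as np

def mask_to_box(mask):
    """Compute an [xmin, ymin, xmax, ymax] box from a 2D binary mask.

    Simplified stand-in for the mask-to-box conversion the DETR panoptic
    data pipeline performs (see util/box_ops.py: masks_to_boxes).
    """
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:  # empty mask -> degenerate box
        return [0, 0, 0, 0]
    return [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]

# A 5x5 mask with a 2x3 blob of "stuff":
mask = np.zeros((5, 5), dtype=bool)
mask[1:3, 2:5] = True
print(mask_to_box(mask))  # -> [2, 1, 4, 2]
```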

I believe I have answered your question, and as such I'm closing this. If you have further concerns, feel free to reach out.

Jacobew commented 3 years ago

Ouch, I see! Thanks for your quick answer!