fundamentalvision / Deformable-DETR

Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Apache License 2.0

Training/finetuning Deformable-DETR on a custom dataset? #49

Open ahmed-nady opened 3 years ago

ahmed-nady commented 3 years ago

Thanks for sharing the Deformable-DETR code. Can you clarify your recommendations for training on a custom dataset? Should we train a model from scratch, or is it better to fine-tune a full COCO-pretrained model and adjust the linear layer to the desired class count?

lucastononrodrigues commented 3 years ago

For DETR in general, adjusting the linear layer works fine, and it should work for Deformable DETR as well; I don't see any reason why it would not generalize. If there is any difference, I would be very interested in knowing about it.
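
For reference, here is a minimal sketch of that recipe, assuming a COCO-pretrained checkpoint whose weights live under a `"model"` key and a model already built (via this repo's build scripts) with the custom number of classes; the checkpoint filename and the `class_embed` key prefix are assumptions to check against your own files:

```python
import torch

# Minimal sketch (not code from this repo): drop the classification-head
# weights from a COCO-pretrained checkpoint so they get re-initialized for a
# custom class count, then load everything else.

checkpoint = torch.load("r50_deformable_detr-checkpoint.pth", map_location="cpu")
state_dict = checkpoint["model"]  # assumed key, as in the released checkpoints

# With box refinement / two-stage there is one classification head per decoder
# layer (class_embed.0, class_embed.1, ...), so filter by substring.
state_dict = {k: v for k, v in state_dict.items() if "class_embed" not in k}

# `model` is assumed to be built with num_classes set for your dataset.
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("re-initialized (missing) keys:", missing)
```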

ahmed-nady commented 3 years ago

Thanks for your note.

buncybunny commented 3 years ago

@ahmed-nady @lucastononrodrigues

Hey there. Thank you for the discussion. Is the linear layer you guys are talking about the one below, i.e. the final fc layer in the prediction head? https://github.com/fundamentalvision/Deformable-DETR/blob/11169a60c33333af00a4849f1808023eba96a931/models/deformable_detr.py#L54

Or should I adjust the other linear layers such as nn.Linear(hidden_dim, hidden_dim) in the prediction head?

aungpaing98 commented 2 years ago

Is there any final conclusion? I am also looking to fine-tune on a custom dataset.

Is it also possible to freeze the earlier layers and fine-tune only the last 3 layers? I don't see any code for freezing the model, only for the mask (segmentation) model.
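
One way this could look (a sketch only; the `class_embed`/`bbox_embed` attribute names come from models/deformable_detr.py and should be double-checked against the model you build):

```python
import torch

# Sketch: freeze everything, then re-enable gradients only for the prediction
# heads before fine-tuning.
for param in model.parameters():
    param.requires_grad = False

for name, param in model.named_parameters():
    if "class_embed" in name or "bbox_embed" in name:
        param.requires_grad = True

# Give the optimizer only the trainable parameters.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4, weight_decay=1e-4)
```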

JacobBITLABS commented 1 year ago

I just needed this and did almost the same as in DETR: delete the prediction head (FC layers) from the resumed checkpoint. Using the original code base, I build a model to my needs (number of classes) and copy the resumed model's parameters over unchanged. If the number of classes is different, this needs to be handled; I simply omit transferring those weights, which is not necessarily the smartest trick.
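
For what it's worth, a rough sketch of that copy-over step (the checkpoint filename is a placeholder, and `model` is assumed to already be built for the custom class count):

```python
import torch

# Sketch: copy every pretrained tensor whose name and shape match the new
# model, and skip the rest (e.g. class_embed.* when the class count differs).
pretrained = torch.load("r50_deformable_detr-checkpoint.pth", map_location="cpu")["model"]
new_state = model.state_dict()

copied, skipped = [], []
for k, v in pretrained.items():
    if k in new_state and new_state[k].shape == v.shape:
        new_state[k] = v
        copied.append(k)
    else:
        skipped.append(k)

model.load_state_dict(new_state)
print(f"copied {len(copied)} tensors, skipped {len(skipped)}: {skipped}")
```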