lyuwenyu / RT-DETR

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Apache License 2.0
1.64k stars 178 forks

categories of pretrained weight problem #305

Open jeonjj1 opened 1 month ago

jeonjj1 commented 1 month ago

I am planning to use a pretrained weight file to fine-tune on my custom dataset with transfer learning. My dataset has 5 categories, whereas I expect the RT-DETR weight file was trained for 80 categories. I therefore assume I need to modify the weight file to accommodate only 5 categories, but I am not sure how to open the weight file. I would appreciate your assistance with this.

Additionally, I am curious to know if it is necessary to modify the weight file due to the difference in the number of categories.

lyuwenyu commented 1 month ago

You don't need to do this yourself. By default, mismatched weights will be discarded.

https://github.com/lyuwenyu/RT-DETR/blob/main/rtdetr_pytorch/src/solver/solver.py#L142
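The idea of discarding mismatched weights can be illustrated with a minimal sketch. This is not the code at the linked line; it is a stand-alone example that assumes the rule is "keep a pretrained entry only if the model has a parameter with the same name and the same shape". The parameter names (`backbone.conv1.weight`, `head.score.weight`, `head.score.bias`) and the helper `filter_matching` are hypothetical, and `SimpleNamespace` objects stand in for tensors so the example runs without PyTorch.

```python
from types import SimpleNamespace


def filter_matching(pretrained, model_state):
    """Split pretrained entries into those that fit the model and those that do not.

    An entry is kept only when the model has a parameter with the same name
    and the same shape; everything else is reported as dropped.
    """
    kept, dropped = {}, []
    for name, value in pretrained.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            kept[name] = value
        else:
            dropped.append(name)
    return kept, dropped


def fake_tensor(*shape):
    """Hypothetical stand-in for a tensor: only carries a .shape attribute."""
    return SimpleNamespace(shape=shape)


# Pretrained checkpoint with an 80-class head (hypothetical names and shapes).
pretrained = {
    "backbone.conv1.weight": fake_tensor(64, 3, 3, 3),
    "head.score.weight": fake_tensor(80, 256),
    "head.score.bias": fake_tensor(80),
}

# The same model rebuilt with a 5-class head.
model_state = {
    "backbone.conv1.weight": fake_tensor(64, 3, 3, 3),
    "head.score.weight": fake_tensor(5, 256),
    "head.score.bias": fake_tensor(5),
}

kept, dropped = filter_matching(pretrained, model_state)
print("kept:", sorted(kept))
print("dropped:", dropped)
```

With real PyTorch objects, the same function could be applied to a checkpoint's state dict and `model.state_dict()`, and the kept entries passed to `model.load_state_dict(kept, strict=False)`, so the class-dependent layers simply start from their fresh initialization.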

jeonjj1 commented 1 month ago

Thank you. However, I am curious if not modifying the pretrained weights will result in longer computation times.

Also, since there are only 5 categories, I wonder if I need to change all instances of num_classes = 80 to 5 in the code, and modify the category items as well. (I want to minimize computation as much as possible.)

lyuwenyu commented 1 month ago

> Also, since there are only 5 categories, I wonder if I need to change all instances of num_classes = 80 to 5 in the code, and modify the category items as well. (I want to minimize computation as much as possible.)

Yes, you need to do so.
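To get a feel for how much computation the change from 80 to 5 classes might save, here is a rough arithmetic sketch. It assumes each class-dependent layer is a single linear layer from a 256-dimensional feature to `num_classes` scores; both the hidden size of 256 and the helper `linear_head_params` are assumptions for illustration, not values read from the repository's configs.

```python
def linear_head_params(hidden_dim, num_classes):
    """Parameters in one linear classification layer: weights plus biases."""
    return hidden_dim * num_classes + num_classes


HIDDEN_DIM = 256  # assumed feature width, for illustration only

params_80 = linear_head_params(HIDDEN_DIM, 80)
params_5 = linear_head_params(HIDDEN_DIM, 5)

print("80 classes:", params_80)  # 20560
print(" 5 classes:", params_5)   # 1285
print("saved per head:", params_80 - params_5)  # 19275
```

Under these assumptions the saving is on the order of tens of thousands of parameters per head, which suggests that setting `num_classes` to 5 matters mainly for correctness of the predictions and the loss rather than for speed.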

jeonjj1 commented 1 month ago

Thank you for your response. I am wondering if there is a way to further reduce the model size. I am looking for methods that can significantly improve the speed, even if it means a slight decrease in accuracy.

lyuwenyu commented 1 month ago

You can try quantization (float32 -> float16 or float32 -> int8).


P.S. RT-DETRv2 introduces a discrete deformable attention method to speed up inference. https://github.com/lyuwenyu/RT-DETR/issues/179
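The two quantization options mentioned above (float32 -> float16 and float32 -> int8) can be illustrated numerically with a small standard-library sketch. It is only a demonstration of what happens to the values, not a deployment recipe: the helper names are hypothetical, the int8 scheme shown is a simple symmetric per-tensor one chosen for illustration, and an actual model would be converted with the export and quantization tooling of the chosen framework or runtime.

```python
import struct


def to_float16(x):
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: returns (codes, scale).

    The scale maps the largest absolute value to 127; every value is then
    rounded to the nearest integer code in [-127, 127].
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.003, 1.0]  # made-up example values

half = [to_float16(w) for w in weights]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

print("float16 :", half)
print("int8    :", codes)  # [50, -127, 0, 100]
print("restored:", restored)
```

In this example the small weight 0.003 collapses to the code 0 under int8, which shows where the accuracy loss comes from: values much smaller than the scale are rounded away, while float16 keeps them at reduced precision.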

Caohz678 commented 2 days ago

Hello, I’d like to ask where exactly the code for the CCFF module is located. I haven’t been able to find it. Thank you very much!