Closed — duchieuphan2k1 closed this issue 1 year ago
To use distillation on your data, you need to train a teacher model on your data first, then use the teacher model to supervise the training of the student model.
Firstly, choose a bigger model as the teacher, e.g. damoyolo-m, and train it from scratch or fine-tune it from COCO-pretrained weights on your data.
Secondly, use the pretrained teacher model to distill your target (student) model, e.g. damoyolo-s. For detailed usage of distillation, please refer to ./scripts/coco_distill.sh.
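The teacher-supervises-student step above can be sketched as a standard knowledge-distillation loss (Hinton-style soft targets). The sketch below is a generic illustration in plain Python — the temperature value and the logit vectors are made up, and DAMO-YOLO's internal distillation loss may differ:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw scores."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Generic knowledge-distillation loss, not necessarily the exact
    formulation used inside DAMO-YOLO.
    """
    p = softmax(teacher_logits, temperature)  # teacher: soft targets
    q = softmax(student_logits, temperature)  # student: predictions
    # KL(p || q), scaled by T^2 to keep gradient magnitudes comparable
    return temperature ** 2 * sum(
        pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0
    )

# Identical logits give zero loss; divergent logits give a positive loss.
print(distill_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))      # 0.0
print(distill_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)  # True
```

During training, this term is added to the ordinary detection loss, so the student fits both the ground truth and the teacher's softened predictions.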
I am already using the biggest model (DAMO-YOLO-M). So how should I select a teacher for this model? Or can the distillation technique not be applied to the biggest model?
You can use DAMO-YOLO-M as a teacher model to perform self-distillation on itself, via
python -m torch.distributed.launch --nproc_per_node=8 tools/train.py -f configs/damoyolo_tinynasL35_M.py --tea_config configs/damoyolo_tinynasL35_M_tea.py --tea_ckpt ../damoyolo_tinynasL35_M.pth
We believe this distillation method is effective for the biggest model as well. However, as DAMO-YOLO-M is currently the largest model we offer, we can only conduct self-distillation on it. We plan to introduce larger models in the future, which can then be used to distill DAMO-YOLO-M.
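Conceptually, self-distillation means the teacher is a frozen snapshot of the pretrained M weights while only the student copy receives gradient updates. The toy step below illustrates just that mechanic — the one-weight "model", learning rate, distill weight, and data points are all made up for illustration:

```python
# Toy self-distillation step: teacher is a frozen snapshot of the
# pretrained weights; only the student copy is updated.
teacher_w = 2.0        # frozen snapshot of the pretrained weights
student_w = 2.0        # student starts from the same checkpoint
lr, alpha = 0.1, 0.5   # illustrative learning rate / distill weight

def forward(w, x):
    """Stand-in for a model forward pass: a single linear weight."""
    return w * x

for x, y in [(1.0, 2.5), (2.0, 4.5)]:   # toy (input, target) pairs
    pred_s = forward(student_w, x)
    pred_t = forward(teacher_w, x)       # no gradient flows to the teacher
    # gradient of 0.5*(pred_s - y)**2 + 0.5*alpha*(pred_s - pred_t)**2 wrt w
    grad = (pred_s - y) * x + alpha * (pred_s - pred_t) * x
    student_w -= lr * grad               # teacher_w is never touched

print(teacher_w)  # 2.0  (unchanged: frozen)
```

The task loss pulls the student toward the labels while the distillation term keeps it close to the teacher's predictions, which is why even a same-size frozen teacher can act as a regularizer.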
Could I create damoyolo_tinynasL35_M_tea.py by copying damoyolo_tinynasL35_M.py? Or is there a damoyolo_tinynasL35_M_tea.py file in some other folder? I ask because I don't see a damoyolo_tinynasL35_M_tea.py file in the ./configs folder.
Sorry for the confusion: damoyolo_tinynasL35_M_tea.py is simply a copy of damoyolo_tinynasL35_M.py. The "tea" suffix is only used to distinguish the work folder, as we save checkpoints into "./workdirs/config_file_name/" by default.
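The work-folder convention above can be sketched as follows. The helper name and the exact derivation are assumptions for illustration; only the "./workdirs/config_file_name/" convention comes from the reply:

```python
import os

def default_workdir(config_path):
    """Derive the checkpoint folder from the config file name,
    mirroring the "./workdirs/config_file_name/" convention.
    (Illustrative helper, not part of the DAMO-YOLO codebase.)"""
    name = os.path.splitext(os.path.basename(config_path))[0]
    return os.path.join("./workdirs", name)

# The "_tea" copy gets its own folder, so teacher and student
# checkpoints never overwrite each other.
print(default_workdir("configs/damoyolo_tinynasL35_M.py"))      # ./workdirs/damoyolo_tinynasL35_M
print(default_workdir("configs/damoyolo_tinynasL35_M_tea.py"))  # ./workdirs/damoyolo_tinynasL35_M_tea
```

This is why a byte-identical copy of the config is enough: the differing file name alone keeps the two runs' outputs separate.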
Before Asking
[X] I have read the README carefully.
[X] I want to train my custom dataset, and I have read the tutorials for fine-tuning on your data carefully and organized my dataset correctly.
[X] I have pulled the latest code of the main branch and run again, and the problem still exists.
Search before asking
Question
You said that you use S as the teacher to distill T, and M as the teacher to distill S, while M is distilled by itself. Is there a way for me to apply this technique on my own data?
Additional
No response