How to train custom dataset

RainbowSun11Q2H commented 2 years ago

Thanks for your excellent work.

How can I try the model on my own dataset? I only changed num_classes in coco.py, and the code would not run. If you have a tutorial on how to train on a custom dataset, that would be great.

Thanks for your time.

SlongLiu commented 2 years ago

See #23 for fine-tune details. You can ignore pretrained checkpoints if you want to train DINO on your custom datasets from scratch.

rocketsfallonrocketfalls commented 2 years ago

> See #23 for fine-tune details. You can ignore pretrained checkpoints if you want to train DINO on your custom datasets from scratch.

I have gone through #23 and it was all I needed; everything worked fine.

AI-Passionner commented 2 years ago

I finally got training on my own dataset working. The link to #23 does not seem correct. I am not sure how it worked with dn_labelbook_size set to the number of classes. In my case, I have 3 classes. When I set dn_labelbook_size = 3 and num_classes = 3 in DINO_4scale.py, it didn't work and threw the following error message.

"""
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\IndexKernel.cu:91: block: [33,0,0], thread: [95,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Traceback (most recent call last):
  File "main.py", line 397, in <module>
    main(args)
  File "main.py", line 285, in main
    logger=(logger if args.save_log else None), ema_m=ema_m)
  File "C:\Users\xxx\PycharmProjects\DINO\engine.py", line 52, in train_one_epoch
    loss_dict = criterion(outputs, targets)
  File "C:\Users\xxx\anaconda3\envs\dino\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\xxx\PycharmProjects\DINO\models\dino\dino.py", line 499, in forward
    indices = self.matcher(outputs_without_aux, targets)
  File "C:\Users\xxx\anaconda3\envs\dino\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\xxx\anaconda3\envs\dino\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\xxx\PycharmProjects\DINO\models\dino\matcher.py", line 81, in forward
    cost_class = pos_cost_class[:, tgt_ids] - neg_cost_class[:, tgt_ids]
"""

Then I tried two arbitrary values, 7 and 11, and both worked. I then realized that both parameters should be set to number_class + 1, since 0 might be assigned to the background by default in DETR. Now it is working.

@SlongLiu, @SuperHenry2333 Maybe you could add more details about how to train on custom datasets. This would help more people try DINO. By the way, I have reviewed three of your papers and all of them are very beautiful. This is why I finally came to DINO and tried it on my own dataset to see if it can beat others. However, it took me quite a while to get the training working.
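
One plausible reading of the error above is that the category ids in the annotation file start at 1, so with 3 classes the largest id is 3, which is out of bounds for a class dimension of size 3. The sketch below is not part of DINO; it assumes a COCO-style annotation JSON with a top-level "categories" list, and the helper names are hypothetical. It computes the smallest class count that keeps every category id a valid index.

```python
import json


def required_num_classes(coco_annotations: dict) -> int:
    """Return the smallest class count for which every category id is a valid index.

    Assumes a COCO-style dict with a top-level "categories" list whose
    entries each carry an integer "id".
    """
    category_ids = [cat["id"] for cat in coco_annotations["categories"]]
    if not category_ids:
        raise ValueError("annotation file lists no categories")
    if min(category_ids) < 0:
        raise ValueError("category ids must be non-negative")
    # An id is only a valid index if it is strictly less than the class
    # count, so the count must be at least the largest id plus one.
    return max(category_ids) + 1


def required_num_classes_from_file(path: str) -> int:
    """Same check, reading the annotation JSON from disk."""
    with open(path, "r", encoding="utf-8") as f:
        return required_num_classes(json.load(f))


# Hypothetical three-class dataset whose ids start at 1.
example = {
    "categories": [
        {"id": 1, "name": "class_a"},
        {"id": 2, "name": "class_b"},
        {"id": 3, "name": "class_c"},
    ]
}

num_classes = required_num_classes(example)
print(num_classes)  # 4
```

Under that assumption, the result for a three-class dataset is 4, which matches the number_class + 1 value reported to work for both num_classes and dn_labelbook_size. If the ids in a dataset start at 0 instead, the same helper returns 3.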

SlongLiu commented 2 years ago

Thanks for pointing out the problem. We will correct it and add a manual for custom training later.

IDEA-Research / DINO

How to train custom dataset #81