Training on a custom dataset

Hananali1 commented 2 years ago

Thanks for sharing your great code! I was trying to train your semi-supervised model on custom data, yet I always get unsup_loss_rpn_bbox: 0.0000 and unsup_loss_bbox: 0.0000, even after a long training time. My data has only one object class. Any suggestions, please? Thanks.

This is what I got on the test set. It looks like the network was never trained.

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 3276/3276, 45.6 task/s, elapsed: 72s, ETA: 0s
Evaluating bbox...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.36s).
Accumulating evaluation results...
DONE (t=0.07s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.007
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.010
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.010
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.007
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.000
OrderedDict([('bbox_mAP', 0.007), ('bbox_mAP_50', 0.01), ('bbox_mAP_75', 0.01), ('bbox_mAP_s', 0.0), ('bbox_mAP_m', 0.007), ('bbox_mAP_l', 0.0), ('bbox_mAP_copypaste', '0.007 0.010 0.010 0.000 0.007 0.000')])

MendelXu commented 2 years ago

I think you can try to train a supervised model first. If it is OK, load the weights and train the network in a semi-supervised manner.
At the same time, adjust the pseudo-label threshold according to the distribution of the predictions.
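
One way to read the second suggestion, as a rough sketch: collect the confidence scores the supervised model produces on unlabeled images and derive the threshold from their distribution rather than keeping a fixed value. If almost no predictions clear the threshold, few or no pseudo boxes are produced, which is one plausible reason for unsupervised losses staying at zero. The helper names `suggest_threshold` and `fraction_kept`, the `keep_fraction` parameter, and the scores are all hypothetical and not part of SoftTeacher.

```python
def suggest_threshold(scores, keep_fraction=0.2):
    """Return a score cutoff that keeps roughly the top `keep_fraction` of detections.

    Hypothetical helper for illustration; not part of SoftTeacher.
    """
    if not scores:
        raise ValueError("no scores given")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    # Index of the lowest-scoring detection that is still kept.
    last_kept = max(1, round(len(ranked) * keep_fraction)) - 1
    return ranked[last_kept]


def fraction_kept(scores, threshold):
    """Fraction of detections whose score is at or above `threshold`."""
    return sum(s >= threshold for s in scores) / len(scores)


# Made-up confidence scores, for illustration only.
scores = [0.95, 0.91, 0.40, 0.35, 0.30, 0.22, 0.15, 0.12, 0.08, 0.05]
print(fraction_kept(scores, 0.9))      # share of detections surviving a 0.9 cutoff
print(suggest_threshold(scores, 0.3))  # cutoff that keeps the top 30%
```

Where the threshold is actually set depends on the config of the code version you are running, so check that before changing anything.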

Hananali1 commented 2 years ago

@MendelXu Thanks for your reply. Even when I trained the supervised model, I got a similar issue (shown below). I am not sure why the models could not be trained. Is it a data-related issue, or something related to the weight initialisation, parameters, etc.? Could you please advise?

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 3276/3276, 45.4 task/s, elapsed: 72s, ETA: 0s
Evaluating bbox...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.40s).
Accumulating evaluation results...
DONE (t=0.08s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.020
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.082
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.002
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.001
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.023
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.002
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.030
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.030
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.030
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.002
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.034
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.005
OrderedDict([('bbox_mAP', 0.02), ('bbox_mAP_50', 0.082), ('bbox_mAP_75', 0.002), ('bbox_mAP_s', 0.001), ('bbox_mAP_m', 0.023), ('bbox_mAP_l', 0.002), ('bbox_mAP_copypaste', '0.020 0.082 0.002 0.001 0.023 0.002')])

Hananali1 commented 2 years ago

Solved! Thanks

wjm202 commented 2 years ago

When I run the code on COCO, I get the following statistics during training:

2022-02-24 10:17:56,029 - mmdet.ssod - INFO - Iter [112050/180000] lr: 1.000e-02, eta: 81 days, 16:16:19, time: 100.132, data_time: 98.562, memory: 7480, ema_momentum: 0.9990, sup_loss_rpn_cls: 0.0827, sup_loss_rpn_bbox: 0.0696, sup_loss_cls: 0.3059, sup_acc: 91.7238, sup_loss_bbox: 0.2549, unsup_loss_rpn_cls: 0.0817, unsup_loss_rpn_bbox: 0.0868, unsup_loss_cls: 0.4136, unsup_acc: 96.2484, unsup_loss_bbox: 0.3910, loss: 1.6862

But when I run on my own dataset, I don't get supervised stats, only unsupervised ones, and the accuracy is 100%. Can you please explain?

2022-02-24 10:41:18,992 - mmdet.ssod - INFO - Iter [800/2000] lr: 1.000e-02, eta: 0:12:37, time: 0.633, data_time: 0.018, memory: 2491, ema_momentum: 0.9988, unsup_loss_rpn_cls: 0.0000, unsup_loss_rpn_bbox: 0.0000, unsup_loss_cls: 0.0000, unsup_acc: 100.0000, unsup_loss_bbox: 0.0000, loss: 0.0000

I have the same question. Please tell me how to solve it.
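
To spot this symptom in a long log, here is a small sketch that assumes log lines carry `name: value` pairs like the two lines quoted above. It extracts the loss terms from one line and reports whether any supervised (`sup_`) terms are present and whether every loss is zero. The function names are made up for illustration.

```python
import re

# Matches pairs such as "unsup_loss_cls: 0.4136" or "loss: 1.6862".
LOSS_PAIR = re.compile(r"(\w*loss\w*): ([0-9.]+)")


def parse_losses(line):
    """Extract every loss term from one training log line as {name: value}."""
    return {name: float(value) for name, value in LOSS_PAIR.findall(line)}


def diagnose(line):
    """Flag the two symptoms described in this thread for a single log line."""
    losses = parse_losses(line)
    return {
        "has_supervised_terms": any(k.startswith("sup_") for k in losses),
        "all_losses_zero": bool(losses) and all(v == 0.0 for v in losses.values()),
    }


# Shortened copy of the problematic log line quoted above.
line = (
    "Iter [800/2000] lr: 1.000e-02, unsup_loss_rpn_cls: 0.0000, "
    "unsup_loss_rpn_bbox: 0.0000, unsup_loss_cls: 0.0000, "
    "unsup_acc: 100.0000, unsup_loss_bbox: 0.0000, loss: 0.0000"
)
print(diagnose(line))
```

A line with no `sup_` terms and all-zero losses suggests that no labeled samples or boxes reach the model, which points at the data rather than at the optimizer settings.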

Hananali1 commented 2 years ago

@wjm202 Hi, I can tell you that this issue is caused by imprecise data preparation. Your JSON files don't follow the COCO annotation format, so there is a flaw in your annotations. If you try the supervised model, you will notice that even the supervised model cannot train on your data. Double-check your data annotations.
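
The kind of double-check meant here can be sketched as a small script. The specific checks (required top-level lists, annotations pointing at existing images and categories, boxes with positive width and height) are assumptions about what commonly goes wrong, not a list confirmed in this thread, and `check_coco` is a hypothetical helper.

```python
def check_coco(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    for key in ("images", "annotations", "categories"):
        if not isinstance(coco.get(key), list):
            problems.append(f"missing or non-list key: {key}")
    if problems:
        return problems  # the checks below need all three lists

    image_ids = {img.get("id") for img in coco["images"]}
    category_ids = {cat.get("id") for cat in coco["categories"]}
    seen = set()
    for ann in coco["annotations"]:
        ann_id = ann.get("id")
        if ann_id in seen:
            problems.append(f"duplicate annotation id {ann_id}")
        seen.add(ann_id)
        if ann.get("image_id") not in image_ids:
            problems.append(f"annotation {ann_id}: unknown image_id {ann.get('image_id')}")
        if ann.get("category_id") not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id {ann.get('category_id')}")
        bbox = ann.get("bbox")
        # COCO boxes are [x, y, width, height]; width and height must be positive.
        if not (isinstance(bbox, list) and len(bbox) == 4 and bbox[2] > 0 and bbox[3] > 0):
            problems.append(f"annotation {ann_id}: bad bbox {bbox}")
    return problems


# Tiny made-up example: the second annotation is broken on purpose.
sample = {
    "images": [{"id": 1, "file_name": "a.png", "height": 100, "width": 100}],
    "categories": [{"id": 1, "name": "object"}],
    "annotations": [
        {"id": 0, "image_id": 1, "category_id": 1, "bbox": [10, 10, 20, 20]},
        {"id": 1, "image_id": 2, "category_id": 1, "bbox": [10, 10, 0, 20]},
    ],
}
for problem in check_coco(sample):
    print(problem)
```

For a real file, load it with `json.load` and pass the resulting dict to the function; an empty result only means these particular checks passed.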

wjm202 commented 2 years ago

OK! Thank you very much!

wjm202 commented 2 years ago

My supervised model can be trained, but I always have this problem, even when I train on the COCO dataset.

joeyslv commented 2 years ago

Have you solved it yet?

Hananali1 commented 2 years ago

@joeyslv Double-check your data preparation. In my case, the JSON files didn't follow the COCO annotation format. After I fixed that in my custom data, the issue was solved.

joeyslv commented 2 years ago

Can you send me a small dataset that runs for you? I've changed the dataset many times, but it's still not working. Thank you.

Hananali1 commented 2 years ago

{
  "info": {
    "description": "description of your data",
    "url": "None",
    "version": "x.0",
    "year": 20xx,
    "contributor": "Your name",
    "date_created": "20xx/xx/xx"
  },
  "images": [
    { "height": xxx, "width": xxx, "id": 1100, "file_name": "xxx.png" },
    { "height": xxx, "width": xxx, "id": 1101, "file_name": "xxx.png" },
    ....
    { "height": xxx, "width": xxx, "id": 1200, "file_name": "xxx.png" }
  ],
  "categories": [
    { "supercategory": "none", "id": 0, "name": "None" },
    { "supercategory": "None", "id": 1, "name": "name of class1" },
    { "supercategory": "None", "id": 2, "name": "name of class2" }
  ],
  "annotations": [
    { "segmentation": [], "iscrowd": 0, "area": xxx, "image_id": 1100, "bbox": [ xx, xx, xx, xx ], "category_id": 1, "id": 0 },
    { "segmentation": [], "iscrowd": 0, "area": xxx, "image_id": 1101, "bbox": [ xx, xx, xx, xx ], "category_id": 1, "id": 1 },
    .....
  ]
}

The annotations should be structured like this in the JSON file. @joeyslv
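
To make the template concrete, here is a minimal sketch that builds a dictionary with the same layout and serializes it with Python's `json` module. The file names, image sizes, class name, and box values are placeholders invented for the example, and `make_annotation` is a hypothetical helper. It uses a single class for brevity and treats boxes as `[x, y, width, height]`, as in the COCO format.

```python
import json


def make_annotation(ann_id, image_id, category_id, x, y, w, h):
    """Build one COCO-style box annotation; bbox is [x, y, width, height]."""
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x, y, w, h],
        "area": w * h,
        "iscrowd": 0,
        "segmentation": [],
    }


# All values below are invented placeholders.
coco = {
    "info": {"description": "toy dataset"},
    "images": [
        {"id": 1100, "file_name": "img_1100.png", "height": 480, "width": 640},
        {"id": 1101, "file_name": "img_1101.png", "height": 480, "width": 640},
    ],
    "categories": [{"id": 1, "name": "object", "supercategory": "none"}],
    "annotations": [
        make_annotation(0, 1100, 1, 10, 20, 50, 40),
        make_annotation(1, 1101, 1, 100, 120, 30, 60),
    ],
}

text = json.dumps(coco, indent=2)
# Round-tripping through json confirms the structure is valid JSON.
assert json.loads(text) == coco
print(text)
```

Writing `text` to a file gives an annotation file in this layout; each annotation's `image_id` and `category_id` must match an entry in `images` and `categories`.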

joeyslv commented 2 years ago

I understand. Thank you so much for your help.

wjm202 commented 2 years ago

Thanks very much! It is solved!

microsoft / SoftTeacher

Training on a custom dataset #163