hustvl / SparseInst

[CVPR 2022] SparseInst: Sparse Instance Activation for Real-Time Instance Segmentation
MIT License

Bad performance and slow convergence on my own custom dataset #48

Open Vincent630 opened 2 years ago

Vincent630 commented 2 years ago

Thanks for your brilliant work! Because of its efficient inference and elegant framework (no post-processing, and the IAM design), I am very interested in SparseInst. But I get bad performance on my own dataset with SparseInst: the visualized results are very poor (compared with SOLO and YOLACT), and it can hardly produce even one good result on the train set. If it's not too much bother, please give me some help; I would appreciate it very much. (If possible, I can provide a small dataset of about 500 images.) Train set: 5000+ images, and I followed all of SparseInst's default settings.
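For context, the usual way to plug a custom COCO-format dataset into detectron2 (which SparseInst builds on) is `register_coco_instances`; a minimal sketch, with hypothetical dataset names and paths:

```python
from detectron2.data.datasets import register_coco_instances

# Hypothetical names and paths -- replace with your own dataset.
register_coco_instances(
    "my_train", {}, "datasets/my/annotations/train.json", "datasets/my/train_images"
)
register_coco_instances(
    "my_val", {}, "datasets/my/annotations/val.json", "datasets/my/val_images"
)
# Then point the config at these names (DATASETS.TRAIN / DATASETS.TEST)
# and set the model's class count to match your dataset; check
# sparseinst/config.py for the exact key.
```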

```
Evaluate annotation type *segm*
COCOeval_opt.evaluate() finished in 14.69 seconds.
Accumulating evaluation results...
COCOeval_opt.accumulate() finished in 1.32 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.693
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.822
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.742
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.003
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.056
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.863
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.649
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.712
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.718
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.033
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.162
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.881
[07/01 08:58:51 d2.evaluation.coco_evaluation]: Evaluation results for segm:
|   AP   |  AP50  |  AP75  |  APs  |  APm  |  APl   |
|:------:|:------:|:------:|:-----:|:-----:|:------:|
| 69.257 | 82.168 | 74.232 | 0.290 | 5.635 | 86.327 |
[07/01 08:58:51 d2.evaluation.coco_evaluation]: Per-category segm AP:
| category   | AP  | category | AP     | category | AP     |
|:-----------|:----|:---------|:-------|:---------|:-------|
| background | nan | ground   | 57.630 | carpet   | 80.885 |
[07/01 08:58:52 d2.engine.defaults]: Evaluation results for coco_2017_val in csv format:
[07/01 08:58:52 d2.evaluation.testing]: copypaste: Task: segm
[07/01 08:58:52 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[07/01 08:58:52 d2.evaluation.testing]: copypaste: 69.2574,82.1681,74.2320,0.2904,5.6352,86.3270
```
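For reference, the log above is detectron2's standard COCO evaluation; a minimal sketch of invoking it directly (assuming `cfg` and `model` are built the same way as in the repo's train_net.py):

```python
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

# `cfg` and `model` are assumed to already exist (see train_net.py).
evaluator = COCOEvaluator("coco_2017_val", output_dir="./output")
val_loader = build_detection_test_loader(cfg, "coco_2017_val")
results = inference_on_dataset(model, val_loader, evaluator)  # prints the tables above
print(results["segm"]["AP"])
```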

wondervictor commented 2 years ago

Hi @Vincent630, thanks for your interest in SparseInst; I'd like to fix this! It seems the visualization results are very bad, which is also being explored in #42. I'm working on it now and should be able to offer some suggestions soon.

Vincent630 commented 2 years ago

Thank you so much. Please let me know if you make any progress.

Vincent630 commented 2 years ago

I can also offer some bad-case samples so you can see what I mean. I also have a rough guess: is it possible that the heatmap is the cause? SparseInst uses coarse instance activation maps, which may pick up features from other instances, and that could affect instance boundaries and the separation between instances. In any case, I think SparseInst is a very elegant and intuitive solution, and I hope it gets widely adopted for instance segmentation.

(Attached bad-case samples: 000003, 000008, 000011, 000029, 000034, 000033.)
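To make that hypothesis concrete: in SparseInst, each instance feature is an activation-weighted average of pixel features, so a coarse map that bleeds onto a neighboring object mixes that object's features in. A shape-level sketch of the aggregation (illustrative only, not the repo's exact code):

```python
import torch

N, C, H, W = 100, 256, 64, 64           # instances, channels, map size (illustrative)
feats = torch.randn(1, C, H, W)         # decoder pixel features
iam_logits = torch.randn(1, N, H, W)    # instance activation map logits

iam = iam_logits.flatten(2).sigmoid()          # (1, N, H*W)
iam = iam / iam.sum(dim=-1, keepdim=True)      # normalize each map to sum to 1
pix = feats.flatten(2).transpose(1, 2)         # (1, H*W, C)
inst_feats = torch.bmm(iam, pix)               # (1, N, C): weighted average per instance

# If two instances' activation maps overlap, their aggregated features
# (and the mask kernels predicted from them) partially mix, which is
# consistent with the boundary bleeding described above.
```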

wondervictor commented 2 years ago

@Vincent630, are the evaluation metrics normal (e.g., 69.3 AP), and is the inference speed normal?

Vincent630 commented 2 years ago

Yes, that looks fine. I have evaluated checkpoints from multiple iterations and compared the results with the final model from the default config, but almost every evaluation still produces many bad cases.

Vincent630 commented 2 years ago

These are the evaluation metrics from the test script; they look fine to me. My environment is a Tesla T4. (Screenshot attached.)
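Since inference speed on the T4 came up, a rough way to time the pure forward pass (a sketch; assumes `model` is a built detectron2 model and `batched_inputs` is a list of input dicts in detectron2's expected format):

```python
import time
import torch

model.eval()
with torch.no_grad():
    for _ in range(10):                  # warm-up
        model(batched_inputs)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(100):
        model(batched_inputs)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

fps = 100 * len(batched_inputs) / elapsed
print(f"~{fps:.1f} FPS (model forward only; excludes data loading/decoding)")
```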

By the way, I have two questions that are off this topic:

1. When I change the config key `IMS_PER_BATCH`, e.g., set it to 16, the full 270,000 iterations take about 6 days on two GPUs, but when I set it to 32, the same number of iterations takes about double the time. I don't understand why; it would make sense to me if the time dropped when I maximize the batch size.
2. When training finishes, we get two kinds of files: `instances_predictions.pth` and `model_final.pth` (the iteration checkpoint). How do I use `instances_predictions.pth`, and what is the difference between it and `model_final.pth`? (I tried to visualize with `instances_predictions.pth`, but the official demo.py script reports a "key error".)

wondervictor commented 2 years ago

Hi @Vincent630, we adopt 64 images per batch on 8 GPUs (11 GB memory, 8 images per GPU) for COCO training. Reducing the batch size might hurt performance a bit, e.g., about -0.8 AP when going from 64 to 32. For your task, you can reduce the batch size to fit your GPUs and reduce the iterations (e.g., 180k) for faster training. As for the second question: `model_final.pth` holds the final training state, including the model weights and the scheduler and optimizer state dicts, while `instances_predictions.pth` contains the raw predictions on the val set. For visualization, you need to load the pre-trained weights, i.e., `model_final.pth`.
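A quick way to see the difference between the two files, assuming detectron2's standard checkpoint and evaluator output layout:

```python
import torch

# Training checkpoint: state dicts, usable as weights for demo.py / evaluation.
ckpt = torch.load("output/model_final.pth", map_location="cpu")
print(ckpt.keys())    # typically 'model', 'optimizer', 'scheduler', 'iteration'

# Evaluator dump: raw per-image predictions on the val set -- not weights.
preds = torch.load("output/instances_predictions.pth", map_location="cpu")
print(type(preds), len(preds))   # a list with one entry per image
print(preds[0].keys())           # e.g., 'image_id', 'instances'
```

This is also likely why demo.py raises a KeyError when pointed at `instances_predictions.pth`: the checkpoint loader expects a `model` state dict that only `model_final.pth` contains.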

Vincent630 commented 2 years ago

I'm not sure I understand the answer that `instances_predictions.pth` is "raw predictions". I thought that, since it comes from inference, it should contain the correct predictions, including the masks and instance info; otherwise, what is this "instance predictions" file made for?