Open emvollmer opened 1 year ago
I haven't been able to solve the problem but it's definitely an issue with training.
I have 634 training images and 159 test images. The previously displayed logs came from running `train.py` with `batch_size=2`. That means only 39 * 2 = 78 images are used in training, which coincidentally is the exact number of images that contain a person annotation. All others are apparently ignored?
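One way to test the "all others are ignored" suspicion is to count, directly from the COCO-style annotation file, how many images survive when only some category names are accepted. The sketch below uses only the standard library and a toy annotation dict. The `car` category and the helper names are made up for illustration. It assumes the usual COCO layout with `categories` and `annotations` keys.

```python
from collections import Counter


def images_per_category(coco: dict) -> Counter:
    """For each category name, count the distinct images that contain it."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    pairs = {(ann["image_id"], id_to_name[ann["category_id"]])
             for ann in coco["annotations"]}
    return Counter(name for _, name in pairs)


def images_kept(coco: dict, class_names) -> int:
    """Count images with at least one annotation whose category is in class_names."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    wanted = set(class_names)
    return len({ann["image_id"] for ann in coco["annotations"]
                if id_to_name[ann["category_id"]] in wanted})


# Toy data: "car" is a hypothetical class, only "person" comes from the issue.
toy = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "categories": [{"id": 1, "name": "car"}, {"id": 2, "name": "person"}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 2, "category_id": 2},
        {"image_id": 2, "category_id": 2},
        {"image_id": 3, "category_id": 1},
    ],
}

print(images_per_category(toy))
print(images_kept(toy, ["person"]))
print(images_kept(toy, ["car", "person"]))
```

On the real training annotations (loaded with `json.load`), `images_kept(coco, ["person"])` returning 78 while the full class list returns 634 would suggest the dataset is being filtered by class name rather than something going wrong inside the model.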
I've double-checked both the MMDetection InstanceSeg_Tutorial and general demo to ensure I have the configs adapted correctly for retraining, which was the case. In particular, both `cfg.model.roi_head.bbox_head.num_classes` and `cfg.model.roi_head.mask_head.num_classes` are set to match the number of classes I have.
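The two `num_classes` settings only size the model heads. As far as the MMDetection v3 documentation describes it, the dataset receives its class names separately, through a `metainfo` entry on each dataloader's dataset; without it, `CocoDataset` is expected to fall back to its built-in COCO class names, of which "person" is one. Whether that applies to this setup is an assumption. The sketch below uses a plain nested dict as a stand-in for the config, a hypothetical helper `apply_class_settings`, and placeholder class names.

```python
def apply_class_settings(cfg: dict, classes: tuple) -> dict:
    """Set head sizes and dataset class names consistently in a config-like dict."""
    num = len(classes)
    cfg["model"]["roi_head"]["bbox_head"]["num_classes"] = num
    cfg["model"]["roi_head"]["mask_head"]["num_classes"] = num
    # Every dataloader's dataset needs the same class tuple.
    for key in ("train_dataloader", "val_dataloader", "test_dataloader"):
        cfg[key]["dataset"]["metainfo"] = dict(classes=classes)
    return cfg


# Placeholder names: replace with the 11 category names from the annotation file.
# "person" is placed 9th, as described in the issue.
CLASSES = tuple(f"class_{i}" for i in range(1, 9)) + ("person", "class_10", "class_11")

# Minimal stand-in for the real config structure.
cfg = {
    "model": {"roi_head": {"bbox_head": {}, "mask_head": {}}},
    "train_dataloader": {"dataset": {}},
    "val_dataloader": {"dataset": {}},
    "test_dataloader": {"dataset": {}},
}
cfg = apply_class_settings(cfg, CLASSES)
print(cfg["train_dataloader"]["dataset"]["metainfo"])
```

The names in `classes` have to match the `name` fields in the annotation file exactly, since the lookup is done by category name.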
I've compared all my configs with their original counterparts here (see `mask-rcnn_r50_fpn_1x_coco.py` and the files it inherits from) and am using the current `train.py`.
As I'm at a loss as to where else this could be coming from, any help would be greatly appreciated!
(FYI: Another peculiar thing that's happening is that the statement `----FORWARD IS DONE-----` gets printed dozens of times in between log messages. I haven't had that happen using previous versions of MMDet and haven't seen it mentioned here anywhere either... Probably not correlated but still wanted to mention it.)
Hi there,
I've previously used MMDet v2 and now have switched to v3 to retrain a MaskRCNN model on a custom dataset. Some background info about my dataset and the adaptations made to customize things:
`train.py` and `test.py` are adapted to match my requirements and some custom modules were created where necessary: data loader `CustomDataPreprocessor` and `LoadNumpyImageFromFile` (inherits from `LoadImageFromFile`), visualizer hook `NumpyDetVisualizationHook` (inherits from `DetVisualizationHook`).
Procedure
I used the following commands to call `train.py` and `test.py`:
Expected results
Model trained to identify instances in images of all 11 classes.
The standard `test.py` call should return overall metrics and plot all annotations and predicted instances side by side in the `plots/` folder.
The classwise `test.py` call should show the individual metrics for each of the 10 classes and plot separate images for each class with ground truth annotations and predictions side by side.
Actual results
Everything runs without errors, but the `test.py` outputs show only a single class out of the 11 in the plots, and only the metrics of that class are reported.
Both `test.py` calls plot the same thing: side by side images showing only the ground truth "person" annotations and predicted "person" instances. Both overall and classwise metrics are the same. These are also the same as the outputs from the `train.py` script, which merely shows the metrics for the test dataset, as I don't have a validation dataset.
I can't figure out where things are going wrong, whether there's an issue with training or just in the display of the results. The class that is shown ("person") is the 9th out of 11, but is the last class to occur in both train and test datasets going by order of images, so maybe the outputs are being overwritten so only the last one remains?
Thanks in advance for any help, ideas or assistance you can provide! I've added more details below.
Details
Below you'll find config excerpts from the log, which in this case is the same for `train.py` and `test.py`. For evaluation, I used the standard `CocoMetric` and, through the `DetLocalVisualizer`, `DumpDetResults`. Important, relevant changes are in bold.
The results from `train.py` (displayed using the test dataset, as I don't have a validation one) were the following:
2023/06/28 10:35:17 - mmengine - INFO - Evaluating bbox...
2023/06/28 10:35:17 - mmengine - INFO - bbox_mAP_copypaste: 0.404 0.685 0.407 0.252 0.419 0.200
2023/06/28 10:35:17 - mmengine - INFO - Evaluating segm...
2023/06/28 10:35:17 - mmengine - INFO - segm_mAP_copypaste: 0.351 0.676 0.301 0.076 0.361 0.300
2023/06/28 10:35:17 - mmengine - INFO - Epoch(val) [60][80/80] coco/bbox_mAP: 0.4040 coco/bbox_mAP_50: 0.6850 coco/bbox_mAP_75: 0.4070 coco/bbox_mAP_s: 0.2520 coco/bbox_mAP_m: 0.4190 coco/bbox_mAP_l: 0.2000 coco/segm_mAP: 0.3510 coco/segm_mAP_50: 0.6760 coco/segm_mAP_75: 0.3010 coco/segm_mAP_s: 0.0760 coco/segm_mAP_m: 0.3610 coco/segm_mAP_l: 0.3000 data_time: 0.0538 time: 0.4626

2023/06/30 10:16:52 - mmengine - WARNING - The prefix is not set in metric class DumpDetResults.
2023/06/30 10:16:54 - mmengine - INFO - Load checkpoint from /.../model/outputs/epoch_60.pth
2023/06/30 10:20:37 - mmengine - INFO - Epoch(test) [ 50/159] eta: 0:07:51 time: 4.3254 data_time: 3.8272 memory: 3966
2023/06/30 10:24:13 - mmengine - INFO - Epoch(test) [100/159] eta: 0:04:16 time: 4.3728 data_time: 4.0667 memory: 3966
2023/06/30 10:27:51 - mmengine - INFO - Epoch(test) [150/159] eta: 0:00:39 time: 4.3717 data_time: 4.0541 memory: 3966
2023/06/30 10:28:27 - mmengine - INFO - Evaluating bbox...
2023/06/30 10:28:27 - mmengine - INFO - bbox_mAP_copypaste: 0.404 0.685 0.407 0.252 0.419 0.200
2023/06/30 10:28:27 - mmengine - INFO - Evaluating segm...
2023/06/30 10:28:27 - mmengine - INFO - segm_mAP_copypaste: 0.350 0.675 0.301 0.076 0.361 0.300
2023/06/30 10:28:27 - mmengine - INFO - Results has been saved to /.../model/outputs/eval/predictions_epoch-60.pickle.
2023/06/30 10:28:27 - mmengine - INFO - Epoch(test) [159/159] coco/bbox_mAP: 0.4040 coco/bbox_mAP_50: 0.6850 coco/bbox_mAP_75: 0.4070 coco/bbox_mAP_s: 0.2520 coco/bbox_mAP_m: 0.4190 coco/bbox_mAP_l: 0.2000 coco/segm_mAP: 0.3500 coco/segm_mAP_50: 0.6750 coco/segm_mAP_75: 0.3010 coco/segm_mAP_s: 0.0760 coco/segm_mAP_m: 0.3610 coco/segm_mAP_l: 0.3000 data_time: 3.9675 time: 4.3337

2023/06/30 14:05:50 - mmengine - WARNING - The prefix is not set in metric class DumpDetResults.
2023/06/30 14:05:57 - mmengine - INFO - Load checkpoint from /.../model/outputs/epoch_60.pth
2023/06/30 14:09:53 - mmengine - INFO - Epoch(test) [ 50/159] eta: 0:08:19 time: 4.5861 data_time: 3.8918 memory: 3966
2023/06/30 14:13:29 - mmengine - INFO - Epoch(test) [100/159] eta: 0:04:24 time: 4.3830 data_time: 4.0796 memory: 3966
2023/06/30 14:17:09 - mmengine - INFO - Epoch(test) [150/159] eta: 0:00:40 time: 4.3968 data_time: 4.0065 memory: 3966
2023/06/30 14:17:44 - mmengine - INFO - Evaluating bbox...
2023/06/30 14:17:44 - mmengine - INFO -
+----------+-------+--------+--------+-------+-------+-------+
| category | mAP   | mAP_50 | mAP_75 | mAP_s | mAP_m | mAP_l |
+----------+-------+--------+--------+-------+-------+-------+
| person   | 0.404 | 0.685  | 0.407  | 0.252 | 0.419 | 0.2   |
+----------+-------+--------+--------+-------+-------+-------+
2023/06/30 14:17:44 - mmengine - INFO - bbox_mAP_copypaste: 0.404 0.685 0.407 0.252 0.419 0.200
2023/06/30 14:17:44 - mmengine - INFO - Evaluating segm...
2023/06/30 14:17:45 - mmengine - INFO -
+----------+------+--------+--------+-------+-------+-------+
| category | mAP  | mAP_50 | mAP_75 | mAP_s | mAP_m | mAP_l |
+----------+------+--------+--------+-------+-------+-------+
| person   | 0.35 | 0.675  | 0.301  | 0.076 | 0.361 | 0.3   |
+----------+------+--------+--------+-------+-------+-------+
2023/06/30 14:17:45 - mmengine - INFO - segm_mAP_copypaste: 0.350 0.675 0.301 0.076 0.361 0.300
2023/06/30 14:17:45 - mmengine - INFO - Results has been saved to /.../model/outputs/eval_classwise/predictions_epoch-60.pickle.
2023/06/30 14:17:45 - mmengine - INFO - Epoch(test) [159/159] coco/person_precision: 0.3500 coco/bbox_mAP: 0.4040 coco/bbox_mAP_50: 0.6850 coco/bbox_mAP_75: 0.4070 coco/bbox_mAP_s: 0.2520 coco/bbox_mAP_m: 0.4190 coco/bbox_mAP_l: 0.2000 coco/segm_mAP: 0.3500 coco/segm_mAP_50: 0.6750 coco/segm_mAP_75: 0.3010 coco/segm_mAP_s: 0.0760 coco/segm_mAP_m: 0.3610 coco/segm_mAP_l: 0.3000 data_time: 3.9770 time: 4.4266