Unusual Z-Value Predictions and Multiple Predictions per Class on Custom Dataset

Shromm-Gaind commented 9 months ago

Prerequisites

[x] Searched through existing Issues and Discussions without finding a solution. Also reviewed relevant GitHub issues.
[x] Experimented with adjusting score_thr to improve visualization outcomes.
[x] Confirmed that the unexpected results are not due to errors in the training dataset.

Issue Description: I am seeing unrealistic prediction values and multiple predictions for the same object class when running a model trained on a custom dataset. Specifically, some predicted bounding boxes have highly unrealistic z-values (e.g., a maximum z-value of 4000 meters), and a single object can receive several predictions of the same class. I am trying to understand how the network generates these predictions and what might be causing such unusual results.

Environment

Docker environment built from a Dockerfile.
GPU: NVIDIA GeForce RTX 3090.

Dataset and Model Performance:

Custom dataset with 26 classes, comprised of 2000 point clouds each for training and validation.
Achieved mAP@50 of 0.76 and mAP@25 of 0.87.

Issue Details: During visualization tests, I observed some bounding boxes with unrealistic z-values. For example, in a generated text file with the format '{label} {x_min} {y_min} {z_min} {x_max} {y_max} {z_max}', some z-values are significantly off, such as (an example from the dataset):

(Link to the problematic bounding box file: 1840_boxes.txt)

6 0.7325 0.0170 -8.6887 0.8170 0.1027 11.7481
6 0.4817 -0.1613 -321.8918 0.5632 -0.0771 325.0209
6 0.6177 -0.0779 -9.3888 0.7026 0.0087 12.4312
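A minimal sketch of how such boxes could be spotted in an exported file, assuming the seven-field line format described above. The helper names (`parse_boxes`, `flag_tall_boxes`) and the `max_z_extent` cutoff are hypothetical and would need to be chosen for the dataset's actual scale; this is a post-hoc sanity check, not part of tr3d or mmdetection3d.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    label: int
    x_min: float
    y_min: float
    z_min: float
    x_max: float
    y_max: float
    z_max: float


def parse_boxes(text: str) -> List[Box]:
    """Parse lines of '{label} {x_min} {y_min} {z_min} {x_max} {y_max} {z_max}'."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 7:
            continue  # skip blank or malformed lines
        boxes.append(Box(int(parts[0]), *map(float, parts[1:])))
    return boxes


def flag_tall_boxes(boxes: List[Box], max_z_extent: float) -> List[Box]:
    """Return boxes whose z extent exceeds a (hypothetical) plausibility cutoff."""
    return [b for b in boxes if (b.z_max - b.z_min) > max_z_extent]


# The three class-6 boxes quoted in the issue.
SAMPLE = """\
6 0.7325 0.0170 -8.6887 0.8170 0.1027 11.7481
6 0.4817 -0.1613 -321.8918 0.5632 -0.0771 325.0209
6 0.6177 -0.0779 -9.3888 0.7026 0.0087 12.4312
"""

boxes = parse_boxes(SAMPLE)
for b in flag_tall_boxes(boxes, max_z_extent=100.0):
    print(f"class {b.label}: z extent {b.z_max - b.z_min:.4f}")
```

A check like this only reports suspicious boxes; it does not explain why the network produced them.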

In particular, for class ID 6, which showed a mAP@50 of 0.50, and class ID 7, with a mAP@50 of 0.07, there were notably inaccurate predictions. I understand that lower mAP scores might lead to poorer predictions. However, raising score_thr to 0.6 for this example did not change the predicted bounding boxes, although it did decrease the mAP@50 to 0.73.

Questions:

How does the network produce such unrealistic z-value predictions?
How could I filter out these bad predictions?

Steps Taken: I have successfully created and trained the model on my custom dataset following the mmdetection3d documentation. The anomaly was detected during post-training visualization of the detection results.

filaPro commented 9 months ago

Achieved mAP@50 of 0.76 and mAP@25 of 0.87

Looks like the metrics are very good. So these large boxes should have low scores. What scores do they have?

Shromm-Gaind commented 9 months ago

Clarification and Results Update:

I would like to clarify the performance metrics for class ID 6 and class ID 7. I mistakenly referred to them as mAP when I meant AP (Average Precision). Additionally, I have found that raising score_thr to 0.7 filters out some of the inaccurate predictions mentioned above, at the expense of mAP.

Updated Performance Metrics:

Below are the detailed results for each class:

+----------------+---------+---------+---------+---------+
| classes        | AP_0.25 | AR_0.25 | AP_0.50 | AR_0.50 |
+----------------+---------+---------+---------+---------+
| Sitting        | 0.9978  | 0.9979  | 0.9869  | 0.9918  |
| Snout          | 0.9961  | 0.9964  | 0.9662  | 0.9704  |
| Neck           | 0.9027  | 0.9495  | 0.5060  | 0.6835  |
| Base left ear  | 0.0006  | 0.0409  | 0.0003  | 0.0288  |
| Tip left ear   | 0.9663  | 0.9796  | 0.7737  | 0.8437  |
| Left shoulder  | 0.9912  | 0.9940  | 0.9840  | 0.9880  |
| Left elbow     | 0.9711  | 0.9929  | 0.8876  | 0.9365  |
| Left hand      | 0.9878  | 0.9932  | 0.9597  | 0.9694  |
| Right hand     | 0.9784  | 0.9812  | 0.9430  | 0.9471  |
| Left flank     | 0.7280  | 0.8568  | 0.2968  | 0.5339  |
| Left hip       | 0.9896  | 0.9921  | 0.9523  | 0.9615  |
| Left knee      | 0.9594  | 0.9708  | 0.9033  | 0.9248  |
| Left foot      | 0.9766  | 0.9815  | 0.9305  | 0.9372  |
| Base tail      | 0.9162  | 0.9727  | 0.5688  | 0.7575  |
| Tip tail       | 0.8337  | 0.8668  | 0.4773  | 0.6197  |
| Base right ear | 0.0000  | 0.0000  | 0.0000  | 0.0000  |
| Tip right ear  | 0.9664  | 0.9734  | 0.7627  | 0.8299  |
| Right shoulder | 0.9858  | 0.9878  | 0.9496  | 0.9619  |
| Right hip      | 0.9911  | 0.9926  | 0.9727  | 0.9763  |
| Right knee     | 0.9845  | 0.9921  | 0.9459  | 0.9597  |
| Right foot     | 0.9743  | 0.9761  | 0.9389  | 0.9427  |
| Sternally      | 0.9478  | 0.9490  | 0.8346  | 0.8673  |
| Right elbow    | 0.9743  | 0.9828  | 0.9320  | 0.9398  |
| Right flank    | 0.8306  | 0.9115  | 0.5006  | 0.6609  |
| Laterally      | 0.9799  | 0.9838  | 0.9469  | 0.9575  |
| Standing       | 0.9980  | 0.9980  | 0.9364  | 0.9519  |
+----------------+---------+---------+---------+---------+
| Overall        | 0.8780  | 0.8967  | 0.7637  | 0.8131  |
+----------------+---------+---------+---------+---------+

Shromm-Gaind commented 9 months ago

Here is the defined order of classes within our dataset:

CLASSES = ('Standing', 'Sitting', 'Sternally', 'Laterally', 'Snout', 'Neck', 'Base left ear', 'Base right ear', 'Tip left ear', 'Tip right ear', 'Left shoulder', 'Right shoulder', 'Left elbow', 'Right elbow', 'Left hand', 'Right hand', 'Left flank', 'Right flank', 'Left hip', 'Right hip', 'Left knee', 'Right knee', 'Left foot', 'Right foot', 'Base tail', 'Tip tail')

Regarding the specific examples I mentioned earlier, where I observed unrealistic prediction values for the z-axis in the bounding boxes, those were related to the classes with ID 6 ("Base right ear") and ID 7 ("Base left ear"). It's important to note that these results were obtained using a score_thr of 0.3.

By raising score_thr to 0.7, I was able to filter out some of the inaccurate predictions for these classes, but it dropped my metric to 0.2152 mAP@50.

filaPro commented 9 months ago

I see you have great metrics on all classes except Base right ear and Base left ear. Do you have them in your training data? Are their gt boxes valid? I recommend simply removing these 2 classes from validation, as they have zero accuracy. This will solve your problem even without changing score_thr. Alternatively, you can increase score_thr only for these 2 classes.
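A minimal sketch of the per-class threshold idea, assuming it is applied as a post-processing step on exported predictions. The function name, the `(label, score)` layout, and all scores and threshold values below are hypothetical; the thread does not show a per-class option in the config.

```python
from typing import Dict, List, Tuple

Prediction = Tuple[int, float]  # (label, score); box coordinates omitted for brevity


def filter_by_class_threshold(
    preds: List[Prediction],
    default_thr: float,
    per_class_thr: Dict[int, float],
) -> List[Prediction]:
    """Keep a prediction if its score reaches the threshold for its class."""
    kept = []
    for label, score in preds:
        thr = per_class_thr.get(label, default_thr)  # fall back to the global value
        if score >= thr:
            kept.append((label, score))
    return kept


# Hypothetical scores: the weak classes (IDs 6 and 7 in the thread) get a
# stricter threshold, while every other class keeps the default.
preds = [(6, 0.45), (6, 0.85), (7, 0.60), (24, 0.45), (24, 0.20)]
kept = filter_by_class_threshold(preds, default_thr=0.3, per_class_thr={6: 0.7, 7: 0.7})
print(kept)
```

Keeping the default low for the well-performing classes avoids the overall mAP drop that a single high score_thr caused earlier in the thread.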

Shromm-Gaind commented 9 months ago

I do have them in my training data; I had thought of this. I will look through my data more carefully. What would you suggest for multiple predictions for one class? In this example (259_boxes.txt), Right Foot has three predicted bounding boxes despite having quite a high metric. Would the fix just be to increase score_thr?

24 -0.06998 -0.29320 1.23844 0.02683 -0.19450 1.33211
24 -0.21474 -0.31352 1.28769 -0.03122 -0.12543 1.42425
24 -0.03547 -0.34199 1.29211 0.12538 -0.17522 1.41214

filaPro commented 9 months ago

Try experimenting with iou_thr in your config. This is the NMS parameter that controls how much two boxes must overlap to be treated as duplicates.
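To illustrate what iou_thr controls, here is a simplified sketch of greedy NMS over axis-aligned 3D boxes of a single class. It is not the implementation used by the repository; the function names, boxes, and scores are made up for the example. A lower iou_thr treats boxes with less overlap as duplicates, so fewer boxes survive.

```python
from typing import List, Sequence, Tuple

# (x_min, y_min, z_min, x_max, y_max, z_max)
Box = Tuple[float, float, float, float, float, float]


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for i in range(3):
        overlap = min(a[i + 3], b[i + 3]) - max(a[i], b[i])
        if overlap <= 0:
            return 0.0  # no overlap along this axis
        inter *= overlap
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_thr: float) -> List[int]:
    """Greedy NMS: visit boxes by descending score and drop any box whose
    IoU with an already kept box exceeds iou_thr. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou_3d(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


# Hypothetical boxes: the first two overlap with IoU = 1/3, the third is far away.
boxes = [
    (0.0, 0.0, 0.0, 1.0, 1.0, 1.0),
    (0.5, 0.0, 0.0, 1.5, 1.0, 1.0),
    (5.0, 5.0, 5.0, 6.0, 6.0, 6.0),
]
scores = [0.9, 0.8, 0.7]

print(nms(boxes, scores, iou_thr=0.5))   # overlap of 1/3 is tolerated
print(nms(boxes, scores, iou_thr=0.25))  # overlap of 1/3 now counts as a duplicate
```

Note that score_thr and iou_thr address different problems: the former removes low-confidence boxes, the latter removes overlapping boxes regardless of how confident they are.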

Shromm-Gaind commented 9 months ago

That did fix my problem, thanks a lot.

SamsungLabs / tr3d, issue #22