A question about the parameter `masks`

hamarh / HMNet_pth

PyTorch implementation of Hierarchical Neural Memory Network

BSD 3-Clause "New" or "Revised" License


Hi, I've noticed that your code has several mask-related parameters, including `list_ignore_masks`, `ignore_masks`, and `out_mask`. I'm having some difficulty understanding their purpose.

Based on my understanding of the original YOLOX code, masks were used for the 'Mosaic+Mixup' data augmentation technique. However, in your HMNet implementation, I observed that you primarily employ resize, crop, and horizontal flip operations for data augmentation, which seems unrelated to the use of masks.

After reading through the whole codebase, I still can't figure out what the masks do. Could you please clarify their role within HMNet? I'd greatly appreciate it.

`ignore_masks` flags the ground-truth (GT) bboxes that should be ignored during loss calculation.

The flag is set in `gen1.py` in two cases: (1) the bbox is too small (L307); (2) the bbox is marked as invalid in `validate_bbox.py` (L333).
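As a rough illustration of the two cases above, here is a minimal framework-free sketch. The function name `build_ignore_mask`, the `min_size` parameter, and the size test itself (either side shorter than `min_size`) are assumptions made for this example; they are not taken from `gen1.py`.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def build_ignore_mask(boxes: List[Box], invalid: List[bool], min_size: float) -> List[bool]:
    """Return True for every GT bbox that should be ignored in the loss.

    Hypothetical helper: the real size criterion may differ.
    """
    mask = []
    for (x1, y1, x2, y2), is_invalid in zip(boxes, invalid):
        # Case 1: the bbox is too small (assumed test: either side < min_size).
        too_small = (x2 - x1) < min_size or (y2 - y1) < min_size
        # Case 2: the bbox was already marked as invalid upstream.
        mask.append(too_small or is_invalid)
    return mask


boxes = [(0, 0, 50, 40), (10, 10, 14, 30), (5, 5, 60, 60)]
invalid = [False, False, True]
ignore_mask = build_ignore_mask(boxes, invalid, min_size=10)
print(ignore_mask)  # [False, True, True]
```

The flagged bboxes stay in the sample; only the mask travels with them to the detection head.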

Since the original YOLOX code has no way to ignore particular bboxes, we added this functionality in `det_head_yolox.py` as follows.

L568-L580:

Build `valid_fg` masks marking which GT bboxes take part in the loss calculation.
Build `valid` masks marking which predicted bboxes take part in the loss calculation.
Filter the GT bboxes with the `valid_fg` masks.

L608-L613:

Filter the predicted bboxes with the `valid` masks.
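The steps above could look roughly like the following sketch. It is not the code from `det_head_yolox.py`: the helper names `make_valid_masks` and `select`, and the precomputed `assigned_gt` list that says which GT bbox each prediction is matched to (`-1` for background), are assumptions introduced only to keep the example self-contained.

```python
from typing import List, Sequence, Tuple


def make_valid_masks(ignore_mask: Sequence[bool],
                     assigned_gt: Sequence[int]) -> Tuple[List[bool], List[bool]]:
    """Build the two masks described above (hypothetical helper).

    ignore_mask: per-GT flag, True means "ignore this bbox".
    assigned_gt: per-prediction index of the matched GT bbox, -1 for background.
    """
    # GT bboxes that take part in the loss.
    valid_fg = [not ignored for ignored in ignore_mask]
    # Predictions that take part in the loss: background ones,
    # plus those matched to a GT bbox that is not ignored.
    valid = [gt < 0 or valid_fg[gt] for gt in assigned_gt]
    return valid_fg, valid


def select(items: Sequence, mask: Sequence[bool]) -> list:
    """Keep only the items whose mask entry is True."""
    return [item for item, keep in zip(items, mask) if keep]


ignore_mask = [False, True, True]      # per GT bbox
gt_labels = ["car", "ped", "car"]
assigned_gt = [0, -1, 1, 2, -1]        # per prediction
preds = ["p0", "p1", "p2", "p3", "p4"]

valid_fg, valid = make_valid_masks(ignore_mask, assigned_gt)
print(select(gt_labels, valid_fg))  # ['car']
print(select(preds, valid))         # ['p0', 'p1', 'p4']
```

With tensors, the same filtering would normally be done with boolean-mask indexing rather than list comprehensions.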

Simply removing the ignored bboxes at data loading may introduce noise into training, because the regions of the removed bboxes are then wrongly treated as negatives when the loss is calculated. To ignore the bboxes correctly, we have to pass `ignore_masks` explicitly to the YOLOX head so that it can skip the loss calculation for the predictions assigned to the ignored bboxes.
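A toy calculation may make the difference concrete. The sketch below uses a plain binary cross-entropy objectness loss over three predictions; the numbers and the helper names `bce` and `objectness_loss` are invented for illustration and do not come from HMNet.

```python
import math
from typing import Sequence


def bce(p: float, target: float) -> float:
    """Binary cross-entropy for one predicted probability."""
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def objectness_loss(probs: Sequence[float],
                    targets: Sequence[float],
                    valid: Sequence[bool]) -> float:
    """Sum the BCE over the predictions whose valid flag is True."""
    return sum(bce(p, t) for p, t, keep in zip(probs, targets, valid) if keep)


# Prediction 0 sits on a normal object, prediction 1 on background,
# prediction 2 on an object whose GT bbox is flagged as ignored.
probs = [0.9, 0.1, 0.8]
targets = [1.0, 0.0, 0.0]  # no positive target exists for the ignored bbox

# Bbox removed at data loading: prediction 2 is penalised as a false positive.
loss_removed = objectness_loss(probs, targets, [True, True, True])
# Bbox kept and masked: prediction 2 is skipped in the loss.
loss_masked = objectness_loss(probs, targets, [True, True, False])

print(loss_removed > loss_masked)  # True
```

In the first case the model is pushed to suppress a detection on a real object, which is the noise described above; in the second case that prediction contributes nothing to the loss.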

A question about the parameter `masks` #3