Megvii-BaseDetection / YOLOX

YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Apache License 2.0

Mosaic error #994

Open rkinas opened 2 years ago

rkinas commented 2 years ago

Hi, please consider looking into the mosaic function. I think I found an error. I was trying to implement an Albumentations pipeline and kept getting an error, so I started debugging the code and found that the mosaic algorithm generates targets with size 0. Look here:

I only print the targets in the __call__ function:

class TrainTransform:
    def __init__(self, max_labels=50, flip_prob=0.5, hsv_prob=1.0):
        self.max_labels = max_labels
        self.flip_prob = flip_prob
        self.hsv_prob = hsv_prob

    def __call__(self, image, targets, input_dim):
        print("TRAIN TRANSFORM")
        print(targets)  # I print only targets

Case #1 - no mosaic - everything works perfectly

TRAIN TRANSFORM
[[647.5 213.75 736.25 282.5 0. ]]
TRAIN TRANSFORM
[[403.75 111.25 445. 146.25 0. ]]
TRAIN TRANSFORM
[[ 87.5 732.5 180. 800. 0. ]]
TRAIN TRANSFORM
[[1358.75 656.25 1417.5 718.75 0. ]]
TRAIN TRANSFORM
[[492.5 213.75 531.25 260. 0. ]
 [376.25 336.25 417.5 375. 0. ]]

Case #2 - mosaic_prob = 1

[ 924.27274024 960. 1027.35195473 960. 0. ] [1600. 960. 1600. 960. 0. ] [1440.94550157 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [ 253.86266154 225.81551197 267.52665749 237.32203487 0. ] [ 171.15952818 146.70816701 202.08330848 169.72121282 0. ]]

[ 0. 724.68044514 0. 762.38348968 0. ] [ 793.23250337 602.66065369 852.39074366 659.05887883 0. ] [ 0. 960. 0. 960. 0. ] [ 0. 960. 0. 960. 0. ] [1351.49268456 960. 1395.86564479 960. 0. ] [1306.2176049 960. 1365.95252008 960. 0. ] [1134.91053628 960. 1176.97962312 960. 0. ] [ 897.39548351 960. 930.70017725 960. 0. ] [ 483.42069316 178.34964884 529.4976874 220.8822589 0. ]]

[1600. 0. 1600. 48.58649928 0. ] [1600. 17.36227029 1600. 76.92993114 0. ] [1152.51476424 960. 1250.6877682 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1600. 960. 1600. 960. 0. ] [1253.83566329 840.65383707 1323.14144987 913.80994513 0. ] [1074.79571461 64.81405945 1142.17634046 141.82048899 0. ]

self.data_num_workers = 2
self.num_classes = 1

self.input_size = (960, 1600)
self.random_size = (10, 25)
self.test_size = (960, 1600)

# --------------- transform config ----------------- #
self.mosaic_prob = 1
self.mixup_prob = 0.5
self.hsv_prob = 0
self.flip_prob = 0.1
self.degrees = 10.0
self.translate = 0.1
self.mosaic_scale = (0.1, 2)
self.mixup_scale = (0.1, 2)
self.shear = 2.0
self.enable_mixup = True
self.multiscale_range = 5

FateScript commented 2 years ago

I will have a look, please wait.

rkinas commented 2 years ago

I found the problem: bboxes with 0 width or height.

I certainly don't know the code as well as you do, but here is an idea to check:

Clipped bboxes with 0 width or height should be removed.
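
To illustrate how clipping can produce such boxes, here is a minimal sketch. The canvas size follows input_size = (960, 1600) from the config above. The box coordinates and the helper name clip_boxes are made up for illustration, and this is not the actual YOLOX code.

```python
import numpy as np


def clip_boxes(boxes, width, height):
    """Clip [x1, y1, x2, y2, cls] rows to the canvas (hypothetical helper)."""
    clipped = boxes.copy()
    clipped[:, [0, 2]] = np.clip(clipped[:, [0, 2]], 0, width)   # x1, x2
    clipped[:, [1, 3]] = np.clip(clipped[:, [1, 3]], 0, height)  # y1, y2
    return clipped


# Illustrative boxes, not taken from a real dataset.
boxes = np.array([
    [1700.0, 1000.0, 1800.0, 1100.0, 0.0],  # entirely outside the canvas
    [1440.0, 980.0, 1650.0, 1020.0, 0.0],   # entirely below the bottom edge
    [253.0, 225.0, 267.0, 237.0, 0.0],      # fully inside
])

clipped = clip_boxes(boxes, width=1600, height=960)
print(clipped)
```

The first box collapses to a single corner point and the second to a zero-height line on the bottom edge, which has the same shape as the rows in the mosaic output above. Only the third box keeps a positive area.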

https://github.com/Megvii-BaseDetection/YOLOX/blob/dd5700c24693e1852b55ce0cb170342c19943d8b/yolox/data/datasets/mosaicdetection.py#L147

mix_img, padded_labels = self.preproc(mosaic_img, mosaic_labels, self.input_dim)
img_info = (mix_img.shape[1], mix_img.shape[0])

It should look like this (I know it could be implemented better, but this is only a prototype):

mos_labels_idx = []  # indices of boxes with zero width or height
for idx in range(mosaic_labels.shape[0]):
    width = mosaic_labels[idx, 2] - mosaic_labels[idx, 0]
    height = mosaic_labels[idx, 3] - mosaic_labels[idx, 1]
    if width == 0 or height == 0:
        mos_labels_idx.append(idx)
mosaic_labels = np.delete(mosaic_labels, mos_labels_idx, axis=0)
mix_img, padded_labels = self.preproc(mosaic_img, mosaic_labels, self.input_dim)
img_info = (mix_img.shape[1], mix_img.shape[0])

I know this is a workaround and you know your code better, but I will try to contribute and to find out why so few bboxes remain after mosaic and mixup. Also, the Albumentations pipeline raises exceptions indicating that something is wrong with the bbox coordinates.
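
The same filtering can be sketched without a Python loop by using a boolean mask. The function name drop_degenerate_boxes and the min_size parameter are hypothetical. The sample rows are copied from the mosaic output above. Note that this version tests for width and height greater than min_size, so unlike the == 0 check in the prototype it would also drop inverted boxes.

```python
import numpy as np


def drop_degenerate_boxes(labels, min_size=0.0):
    """Keep [x1, y1, x2, y2, cls] rows whose width and height exceed min_size.

    Hypothetical helper for illustration; not part of YOLOX.
    """
    widths = labels[:, 2] - labels[:, 0]
    heights = labels[:, 3] - labels[:, 1]
    keep = (widths > min_size) & (heights > min_size)
    return labels[keep]


# Three rows taken from the mosaic output printed earlier in this issue.
mosaic_labels = np.array([
    [924.27274024, 960.0, 1027.35195473, 960.0, 0.0],            # zero height
    [1600.0, 960.0, 1600.0, 960.0, 0.0],                         # zero width and height
    [253.86266154, 225.81551197, 267.52665749, 237.32203487, 0.0],  # valid box
])

filtered = drop_degenerate_boxes(mosaic_labels)
print(filtered)
```

Only the last row survives. Raising min_size would additionally discard boxes that are only a pixel or two wide, which might be useful if the downstream augmentation library rejects very thin boxes, but that threshold is an assumption to tune rather than a recommendation.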

rkinas commented 2 years ago

And if you are interested, I can share the code change that adds an Albumentations pipeline to YOLOX. It works perfectly and adds many new possibilities for augmenting data.

FateScript commented 2 years ago

And if you are interested, I can share the code change that adds an Albumentations pipeline to YOLOX. It works perfectly and adds many new possibilities for augmenting data.

Sure, you are welcome to contribute.

nminds commented 2 years ago

@rkinas I'd be interested in your integration of Albumentations with yolox as well. Would be great if you could share the code. Many thanks! Cheers