Discussion about the resample strategy for LVIS dataset

open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark

https://mmdetection.readthedocs.io

Apache License 2.0


Discussion about the resample strategy for LVIS dataset #3022

Closed tonysy closed 3 years ago

tonysy commented 4 years ago

MMDetection training is epoch-based. Its resampling strategy adjusts how many times each image appears within one epoch, according to the image's repeat factor. As a result, 24 epochs with resampling take more iterations in total than 24 epochs without it.

Detectron2, by contrast, uses iteration-based training, and the total number of iterations is fixed. Compared with MMDetection, the head classes are therefore actually trained less in the Detectron2 2x setting. So there is a big performance difference between the two training strategies.

So I wonder how 1x or 2x should be defined for a training schedule: by the total number of iterations, or by the number of epochs?

xvjiarui commented 4 years ago

In MMDetection convention, `1x` and `2x` are defined in epochs: 12 epochs and 24 epochs respectively. On COCO this is indeed slightly fewer iterations than Detectron2's `1x` and `2x`. The performance difference has two causes. First, performance on LVIS usually fluctuates more than on COCO. Second, you are correct that the two codebases run different numbers of iterations. Note: here are the iteration counts of the two codebases on COCO.

	MMDetection	Detectron2
1x	87,960	90,000
2x	175,920	180,000
3x	263,880	270,000
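
The MMDetection column can be reproduced with a small calculation. This is only a sketch under assumptions that are not stated in the thread: a total batch size of 16, 117,266 COCO training images after filtering, and `3x` meaning 36 epochs. The function name is hypothetical.

```python
import math


def epoch_schedule_iterations(num_images: int, batch_size: int, epochs: int) -> int:
    """Total optimizer steps for an epoch-based schedule (last partial batch kept)."""
    return math.ceil(num_images / batch_size) * epochs


# Assumed values, chosen because they reproduce the table above.
COCO_TRAIN_IMAGES = 117_266
TOTAL_BATCH_SIZE = 16

for name, epochs in [("1x", 12), ("2x", 24), ("3x", 36)]:
    print(name, epoch_schedule_iterations(COCO_TRAIN_IMAGES, TOTAL_BATCH_SIZE, epochs))
```

Under these assumptions the epoch-based schedule lands a little below the round iteration counts that Detectron2 fixes directly.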

tonysy commented 4 years ago

@xvjiarui Thanks.

There is another issue besides the one discussed above.

I also found a difference in how mmdetection and detectron2 implement repeat-factor-based sampling.

In mmdetection: Repeat factor

`repeat_indices.extend([dataset_index] * math.ceil(repeat_factor))`

This way, the total number of images in one epoch increases from 56740 to 70820.

In detectron2: Repeat factor

`self._int_part = torch.trunc(repeat_factors)`

This way, the total number of images in one epoch increases from 56740 to 61152.

The gap between the two implementations is large (70820 - 61152 = 9668).
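
To make the two rounding rules concrete, here is a minimal sketch in plain Python. It assumes that the truncated integer part on the detectron2 side is combined with one extra copy drawn with probability equal to the fractional part (the quoted line shows only the truncation). The function names and the repeat factors are hypothetical, not taken from either codebase.

```python
import math
import random


def repeats_ceil(repeat_factor: float) -> int:
    """Deterministic rounding up, as in the quoted mmdetection line."""
    return math.ceil(repeat_factor)


def repeats_stochastic(repeat_factor: float, rng: random.Random) -> int:
    """Integer part, plus one extra copy with probability equal to the fractional part."""
    int_part = math.trunc(repeat_factor)
    frac_part = repeat_factor - int_part
    return int_part + (1 if rng.random() < frac_part else 0)


# Hypothetical per-image repeat factors, for illustration only.
repeat_factors = [1.0, 1.0, 1.2, 1.5, 2.3]
rng = random.Random(0)

ceil_total = sum(repeats_ceil(r) for r in repeat_factors)
stochastic_total = sum(repeats_stochastic(r, rng) for r in repeat_factors)
expected_total = sum(repeat_factors)  # mean epoch size under stochastic rounding

print(ceil_total, stochastic_total, expected_total)
```

Rounding up turns every fractional repeat factor into a full extra copy, so the epoch is always at least as large as under stochastic rounding, whose epoch size averages to the sum of the repeat factors.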

tonysy commented 4 years ago

Number of iterations in the two codebases on LVIS:

	MMDetection	Detectron2
2x	106248	90000
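
As a rough check, the 106248 figure follows from the 70820 images per epoch quoted earlier, assuming a total batch size of 16 (an assumption, not stated in the thread). The sketch below also estimates how many resampled epochs would fit in a fixed budget of 90000 iterations. The helper names are hypothetical.

```python
import math


def iterations_per_epoch(num_images: int, batch_size: int) -> int:
    """Steps in one epoch, keeping the last partial batch."""
    return math.ceil(num_images / batch_size)


def epochs_for_budget(iteration_budget: int, num_images: int, batch_size: int) -> float:
    """How many (possibly fractional) epochs fit in a fixed iteration budget."""
    return iteration_budget / iterations_per_epoch(num_images, batch_size)


TOTAL_BATCH_SIZE = 16  # assumed, not stated in the thread

resampled_iters = iterations_per_epoch(70_820, TOTAL_BATCH_SIZE) * 24
plain_iters = iterations_per_epoch(56_740, TOTAL_BATCH_SIZE) * 24
matched_epochs = epochs_for_budget(90_000, 70_820, TOTAL_BATCH_SIZE)

print(resampled_iters, plain_iters, round(matched_epochs, 2))
```

Under these assumptions, 24 resampled epochs correspond to 106248 iterations, 24 epochs without resampling to 85128, and a 90000-iteration budget covers only about 20.3 resampled epochs.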

pengzhiliang commented 4 years ago

@tonysy Hello, I found that training takes 28 hours in mmdetection (4 TITAN GPUs), while 20 hours is enough for detectron2 with the same config (without the resample strategy) and the same hardware. Have you run into this too?

tonysy commented 4 years ago

@pengzhiliang Yes, I see something similar. The total number of iterations in mmdetection is larger than in detectron2 on LVIS, as listed above (106248 vs. 90000).

pengzhiliang commented 4 years ago

@tonysy I know, but the difference in training time is surprising (about 1.5x).

tonysy commented 4 years ago

mmdetection runs an evaluation after each epoch, while detectron2 doesn't. This may be the reason for the longer training time.

pengzhiliang commented 4 years ago

@tonysy I had already changed it to run evaluation every 12 epochs, so it should have little influence. Can you offer any guess about the cause? @xvjiarui Thank you!
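
For reference, an evaluation interval like the one described might look as follows. This is a sketch assuming the MMDetection 2.x config style, where an `evaluation` dict carries an `interval` key; the metric list and the 24-epoch schedule are assumptions, and other versions may use different keys.

```python
# Config fragment in the assumed MMDetection 2.x style (not taken from the thread).
evaluation = dict(interval=12, metric=["bbox", "segm"])
total_epochs = 24

# Epochs (1-based) after which evaluation would run under this interval.
eval_epochs = [e for e in range(1, total_epochs + 1) if e % evaluation["interval"] == 0]
print(eval_epochs)
```

With this setting evaluation runs only twice in a 24-epoch schedule, which supports the point that evaluation alone is unlikely to explain an 8-hour gap.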

xvjiarui commented 3 years ago

@pengzhiliang You may try LVIS v1 now.