IDEA-Research / detrex

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
Apache License 2.0

[Bug] FocusDetr report min size error #308

Open icicle4 opened 9 months ago

icicle4 commented 9 months ago

I ran python tools/ --config-file projects/focus_detr/configs/focus_detr_resnet/ --num-gpus 8 with train.init_checkpoint = detectron2://ImageNetPretrained/torchvision/R-50.pkl.

It reports the error below. My dataset is the default COCO dataset.

Traceback (most recent call last):
  File "tools/", line 313, in <module>
  File "/root/detrex/detectron2/detectron2/engine/", line 79, in launch
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/multiprocessing/", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/multiprocessing/", line 188, in start_processes
    while not context.join():
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/multiprocessing/", line 150, in join
    raise ProcessRaisedException(msg, error_index,

-- Process 4 terminated with the following error:
Traceback (most recent call last):
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/multiprocessing/", line 59, in _wrap
    fn(i, *args)
  File "/root/detrex/detectron2/detectron2/engine/", line 126, in _distributed_worker
  File "/root/detrex/tools/", line 302, in main
    do_train(args, cfg)
  File "/root/detrex/tools/", line 275, in do_train
    trainer.train(start_iter, cfg.train.max_iter)
  File "/root/detrex/detectron2/detectron2/engine/", line 149, in train
  File "/root/detrex/tools/", line 101, in run_step
    loss_dict = self.model(data)
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/modules/", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/parallel/", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/modules/", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/detrex/projects/focus_detr/modeling/", line 269, in forward
    loss_dict = self.criterion(output, targets, dn_meta)
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/modules/", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/detrex/projects/focus_detr/modeling/", line 43, in forward
    losses = super(FOCUS_DETRCriterion, self).forward(outputs, targets)
  File "/root/detrex/projects/focus_detr/modeling/", line 87, in forward
    class_targets = self.target_layer(outputs['srcs'], batch_boxes, batch_classes)
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/modules/", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/detrex/projects/focus_detr/modeling/", line 75, in forward
  File "/root/detrex/projects/focus_detr/modeling/", line 129, in _gen_level_targets
    areas_min_ind = torch.min(areas, dim=-1)[1]  # [batch_size,h*w]
IndexError: min(): Expected reduction dim 2 to have non-zero size.
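
Editor's note: the last frame fails because torch.min is asked to reduce over a dimension of size zero. The sketch below reproduces that error in isolation and shows one possible guard. It assumes that areas has shape [batch_size, h*w, num_boxes] and that the last dimension is empty when a per-GPU batch contains no ground-truth boxes. Neither assumption is confirmed by the thread, and min_area_index is a hypothetical helper, not part of detrex or Focus-DETR.

```python
import torch


def min_area_index(areas: torch.Tensor) -> torch.Tensor:
    """Index of the smallest entry along the last dim, or -1 if that dim is empty.

    Hypothetical helper; `areas` is assumed to be [batch_size, h*w, num_boxes].
    """
    if areas.shape[-1] == 0:
        # Nothing to reduce over: return a sentinel instead of calling torch.min.
        return torch.full(areas.shape[:-1], -1, dtype=torch.long, device=areas.device)
    return torch.min(areas, dim=-1)[1]


# Reproduce the failure from the traceback with an empty last dimension.
empty_areas = torch.empty(2, 6, 0)
try:
    torch.min(empty_areas, dim=-1)
    raised = False
except (IndexError, RuntimeError) as err:
    raised = True
    print(type(err).__name__, err)

# The guarded version returns a [batch_size, h*w] tensor of -1 instead of raising.
print(min_area_index(empty_areas).shape)

# With a non-empty last dimension it behaves like the original line.
print(min_area_index(torch.tensor([[[3.0, 1.0, 2.0]]])))
```

Whether a sentinel, a skipped batch, or filtering out images without boxes is the right fix depends on how the criterion uses these indices downstream. Treat this only as a way to confirm the cause.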

baojunqi commented 6 months ago

Same problem here. Have you solved it?

emotionee commented 6 months ago

I have also run into this problem. It seems to be related to the batch size setting. Have you found a solution?

baojunqi commented 6 months ago

How big is your batch size? I trained DETR on my own dataset with a batch size of 16 and it worked well. However, when I tried to train Focus-DETR with a batch size of 8 on 4 A4000 GPUs, it failed.

SmalWhite commented 4 months ago

Have you solved the problem? I am running into the same error.