ZrrSkywalker / MonoDETR

[ICCV 2023] The first DETR model for monocular 3D object detection with depth-guided transformer
327 stars 31 forks

How to train with multiple GPUs #62

Open HaiJuntang opened 2 months ago

HaiJuntang commented 2 months ago

After setting gpu_ids: [0,1,2,3] in monodetr.yaml to train on multiple GPUs, I get the following error:

Traceback (most recent call last): | 0/464 [00:00<?, ?it/s]
  File "tools/train_val.py", line 113, in <module>
    main()
  File "tools/train_val.py", line 100, in main
    trainer.train()
  File "/media/data2/tanghaijun/newMonoDETR/MonoDETR-main/lib/helpers/trainer_helper.py", line 76, in train
    self.train_one_epoch(epoch)
  File "/media/data2/tanghaijun/newMonoDETR/MonoDETR-main/lib/helpers/trainer_helper.py", line 137, in train_one_epoch
    outputs = self.model(inputs, calibs, targets, img_sizes, dn_args=dn_args)
  File "/media/data2/tanghaijun/anaconda3/envs/newMonoDETR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/data2/tanghaijun/anaconda3/envs/newMonoDETR/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/media/data2/tanghaijun/anaconda3/envs/newMonoDETR/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/media/data2/tanghaijun/anaconda3/envs/newMonoDETR/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/media/data2/tanghaijun/anaconda3/envs/newMonoDETR/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
TypeError: Caught TypeError in replica 1 on device 1.
Original Traceback (most recent call last):
  File "/media/data2/tanghaijun/anaconda3/envs/newMonoDETR/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/media/data2/tanghaijun/anaconda3/envs/newMonoDETR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() missing 4 required positional arguments: 'images', 'calibs', 'targets', and 'img_sizes'
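
The thread does not confirm a cause. The pattern in the traceback (replica 1 is called with keyword arguments but no positional arguments) matches what happens when torch.nn.DataParallel splits the positional inputs into fewer chunks than there are replicas and pads the missing slots with empty tuples. That can occur when the batch is smaller than the number of GPUs, or when one argument cannot be split along its first dimension. The toy model below is a pure-Python sketch of that padding behavior, not PyTorch code. The helpers `chunk`, `toy_scatter_kwargs` and `run` are hypothetical names made up for illustration, and `forward` only borrows the argument names from the error message.

```python
def chunk(seq, n):
    """Split seq into at most n contiguous pieces, like a split along dim 0."""
    size = max(1, -(-len(seq) // n))  # ceiling division, at least 1
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def toy_scatter_kwargs(inputs, kwargs, n_devices):
    """Simplified stand-in for DataParallel's input scattering (assumption)."""
    # Each positional argument is chunked; zip stops at the shortest list,
    # so one argument with few chunks limits all of them.
    per_arg = [chunk(arg, n_devices) for arg in inputs]
    scattered_inputs = [tuple(parts) for parts in zip(*per_arg)]
    # Values that cannot be split are copied to every device.
    scattered_kwargs = [dict(kwargs) for _ in range(n_devices)]
    # The shorter list is padded so both have one entry per replica.
    while len(scattered_inputs) < len(scattered_kwargs):
        scattered_inputs.append(())
    while len(scattered_kwargs) < len(scattered_inputs):
        scattered_kwargs.append({})
    return scattered_inputs, scattered_kwargs


def forward(images, calibs, targets, img_sizes, dn_args=None):
    """Stand-in for the model's forward; returns the per-replica batch size."""
    return len(images)


def run(batch_size, n_devices=4):
    """Call forward once per replica and collect results or error messages."""
    batch = list(range(batch_size))
    inputs = (batch, batch, batch, batch)  # images, calibs, targets, img_sizes
    ins, kws = toy_scatter_kwargs(inputs, {"dn_args": None}, n_devices)
    results = []
    for i, (args, kw) in enumerate(zip(ins, kws)):
        try:
            results.append(forward(*args, **kw))
        except TypeError as exc:
            results.append(f"replica {i}: {exc}")
    return results


if __name__ == "__main__":
    print(run(8))  # every replica receives a slice of the batch
    print(run(1))  # only replica 0 receives data; the others fail
```

If this is what is happening here, it may be worth checking that the batch size is at least the number of GPUs and that every positional argument passed to the model can be split per sample. Switching to DistributedDataParallel, where each process loads its own batch, is another option, though whether the training script supports it is not stated in this thread.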

shawnnnkb commented 1 month ago

Have you solved this problem?