When I train on a single GPU with the command `python tools/train.py ${CONFIG_FILE}`, the following error occurs:

```
2023-07-19 15:31:55,068 - INFO - Distributed training: False
2023-07-19 15:31:56,207 - INFO - load model from: modelzoo://resnet50
2023-07-19 15:31:56,401 - WARNING - The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
2023-07-19 15:31:56,594 - WARNING - The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
loading annotations into memory...
Done (t=2.00s)
creating index...
index created!
loading annotations into memory...
Done (t=2.75s)
creating index...
index created!
2023-07-19 15:32:06,836 - INFO - Start running, host: root@autodl-container-e1cd11a652-fc9fd137, work_dir: /root/UA-CMDet/work_dirs/UACMDet
2023-07-19 15:32:06,836 - INFO - workflow: [('train', 1)], max: 12 epochs
Traceback (most recent call last):
  File "tools/train.py", line 99, in <module>
    main()
  File "tools/train.py", line 95, in main
    logger=logger)
  File "/root/UA-CMDet/mmdet/apis/train.py", line 61, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
  File "/root/UA-CMDet/mmdet/apis/train.py", line 197, in _non_dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/root/miniconda3/envs/UA-CMDet/lib/python3.7/site-packages/mmcv/runner/runner.py", line 364, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/root/miniconda3/envs/UA-CMDet/lib/python3.7/site-packages/mmcv/runner/runner.py", line 275, in train
    self.call_hook('after_train_iter')
  File "/root/miniconda3/envs/UA-CMDet/lib/python3.7/site-packages/mmcv/runner/runner.py", line 231, in call_hook
    getattr(hook, fn_name)(self)
  File "/root/miniconda3/envs/UA-CMDet/lib/python3.7/site-packages/mmcv/runner/hooks/optimizer.py", line 18, in after_train_iter
    runner.outputs['loss'].backward()
  File "/root/miniconda3/envs/UA-CMDet/lib/python3.7/site-packages/torch/tensor.py", line 118, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/root/miniconda3/envs/UA-CMDet/lib/python3.7/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1024, 256, 7, 7]], which is output 0 of IndexPutBackward, is at version 5; expected version 4 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
```
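The `RuntimeError` at the end points to an in-place write (`IndexPutBackward`, i.e. an indexed assignment like `t[idx] = ...`) on a tensor that autograd still needs for the backward pass. A minimal sketch of the same failure mode, and an out-of-place alternative, assuming nothing about UA-CMDet's actual code:

```python
import torch

# Reproduce the error class: exp() saves its output for the backward
# pass (d/dx exp(x) = exp(x)), so mutating y in place invalidates it.
x = torch.randn(3, requires_grad=True)
y = x.exp()
y[0] = 0.0  # indexed in-place write -> bumps y's version counter
try:
    y.sum().backward()
except RuntimeError as e:
    print("autograd error:", e)  # "... modified by an inplace operation ..."

# Out-of-place alternative: build a new tensor instead of mutating
# the one autograd saved.
x2 = torch.randn(3, requires_grad=True)
y2 = x2.exp()
mask = torch.ones_like(y2)
mask[0] = 0.0   # mask carries no grad history, so in-place here is safe
z = y2 * mask   # new tensor; y2 is left untouched for the backward pass
z.sum().backward()
print(x2.grad)  # gradients flow normally
```

In this issue the offending write happens somewhere in the forward pass on a `[1024, 256, 7, 7]` feature tensor; replacing the indexed assignment with `torch.where`, a mask multiply, or a `.clone()` before the write is the usual remedy.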