Training a RepPoints network with tools/train.py raises RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
The last lines of the traceback are:
File "xxxxxxxx\mmdetection\mmdet\models\task_modules\assigners\point_assigner.py", line 114, in assign
points_index = points_range[lvl_idx]
According to GitHub Desktop and my own recollection, I have not modified any mmdetection code.
I am using the CityPersons dataset, but I believe the problem is not related to the dataset.
Environment
sys.platform: win32
Python: 3.10.9 | packaged by Anaconda, Inc. | (main, Mar 1 2023, 18:18:15) [MSC v.1916 64 bit (AMD64)]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce RTX 3050 Laptop GPU
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6
NVCC: Cuda compilation tools, release 11.6, V11.6.124
MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.35.32217.1 for x64
GCC: n/a
PyTorch: 1.13.1
PyTorch compiling details: PyTorch built with:
C++ Version: 199711
MSVC 192829337
Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
Error traceback
Traceback (most recent call last):
  File "path_to_my_project\train.py", line 133, in <module>
    main()
  File "path_to_my_project\train.py", line 129, in main
    runner.train()
  File "C:\Users\HuShi\anaconda3\lib\site-packages\mmengine\runner\runner.py", line 1721, in train
    model = self.train_loop.run() # type: ignore
  File "C:\Users\HuShi\anaconda3\lib\site-packages\mmengine\runner\loops.py", line 278, in run
    self.run_iter(data_batch)
  File "C:\Users\HuShi\anaconda3\lib\site-packages\mmengine\runner\loops.py", line 301, in run_iter
    outputs = self.runner.model.train_step(
  File "C:\Users\HuShi\anaconda3\lib\site-packages\mmengine\model\base_model\base_model.py", line 114, in train_step
    losses = self._run_forward(data, mode='loss') # type: ignore
  File "C:\Users\HuShi\anaconda3\lib\site-packages\mmengine\model\base_model\base_model.py", line 340, in _run_forward
    results = self(*data, mode=mode)
  File "C:\Users\HuShi\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "xxxxx\mmdetection\mmdet\models\detectors\base.py", line 92, in forward
    return self.loss(inputs, data_samples)
  File "xxxxx\mmdetection\mmdet\models\detectors\single_stage.py", line 78, in loss
    losses = self.bbox_head.loss(x, batch_data_samples)
  File "xxxxx\mmdetection\mmdet\models\dense_heads\base_dense_head.py", line 123, in loss
    losses = self.loss_by_feat(*loss_inputs)
  File "xxxxx\mmdetection\mmdet\models\dense_heads\reppoints_head.py", line 705, in loss_by_feat
    cls_reg_targets_init = self.get_targets(
  File "xxxxx\mmdetection\mmdet\models\dense_heads\reppoints_head.py", line 561, in get_targets
    sampling_results_list) = multi_apply(
  File "xxxxx\mmdetection\mmdet\models\utils\misc.py", line 219, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "xxxxx\warpnet\mmdetection\mmdet\models\dense_heads\reppoints_head.py", line 444, in _get_targets_single
    assign_result = assigner.assign(pred_instances, gt_instances,
  File "xxxxx\warpnet\mmdetection\mmdet\models\task_modules\assigners\point_assigner.py", line 114, in assign
    points_index = points_range[lvl_idx]
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
Bug fix
Creating points_range on the same device as points_lvl, rather than on the CPU by default, solves the problem.
I will create a PR to fix it.
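The idea behind the fix can be illustrated with a minimal sketch. This is not the actual mmdetection code: the helper name select_level_indices and the sample values are hypothetical, and only the names points_range, points_lvl, lvl_idx and points_index are taken from the report. It assumes lvl_idx is a boolean mask computed from points_lvl, so it lives on the same device as points_lvl. The script falls back to the CPU when no GPU is available, in which case the error would not occur either way.

```python
import torch


def select_level_indices(points_lvl: torch.Tensor, gt_lvl: int) -> torch.Tensor:
    """Hypothetical helper: return the indices of points on pyramid level gt_lvl."""
    # Build the index range on the same device as points_lvl. A plain
    # torch.arange(n) would live on the CPU, and indexing it with a CUDA
    # mask is what triggers the RuntimeError from the report.
    points_range = torch.arange(points_lvl.shape[0], device=points_lvl.device)
    # Boolean mask; it is created on the device of points_lvl.
    lvl_idx = points_lvl == gt_lvl
    return points_range[lvl_idx]


# Use the GPU when present so the device mismatch would actually matter.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
points_lvl = torch.tensor([3, 4, 3, 5, 4], dtype=torch.int32, device=device)
points_index = select_level_indices(points_lvl, gt_lvl=4)
print(points_index.tolist())
```

Keeping the index range and the mask on one device also avoids moving the mask back to the CPU for every ground-truth box.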
I have run into the same issue. It looks like a torch upgrade caused it, as the issue was not present in torch 1.10. This is the check in torch that throws the error. Can this be fixed in the next release?
Other package versions from the environment report:
TorchVision: 0.14.1
OpenCV: 4.7.0
MMEngine: 0.7.3
MMDetection: 3.0.0+ecac3a7