Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!

yuantn / MI-AOD

Code for Multiple Instance Active Learning for Object Detection, CVPR 2021

https://openaccess.thecvf.com/content/CVPR2021/papers/Yuan_Multiple_Instance_Active_Learning_for_Object_Detection_CVPR_2021_paper.pdf

Apache License 2.0


Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! #61

Closed jiahao12121 closed 2 years ago

jiahao12121 commented 2 years ago

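1. After setting up the environment, I wanted to test a single image with test_single.py. GPU 0 was already in use, so I tried to use GPU 1 and changed the code in test_single.py to device:"cuda:1", but it still reported insufficient GPU memory. I found that loading the checkpoint always takes up memory on GPU 0, and I don't know why. I then tried switching to CPU for the test, which raised an error immediately, and some memory on GPU 0 was still occupied.

2. I wanted to train on GPU 1 with python tools/train.py configs/MIAOD.py --gpu-ids 1, but got an error saying everything must be on the same device. Does this mean single-GPU training can only use GPU 0?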

yuantn commented 2 years ago

It is recommended to select the GPUs for training or testing with export CUDA_VISIBLE_DEVICES=$GPU_IDS, where $GPU_IDS is the list of GPU indices to use, separated by ASCII (half-width) commas.
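As a rough illustration of the advice above, the same variable can also be set from inside a Python script, provided this happens before CUDA is initialised (safest: before importing torch). This is only a sketch, not code from the MI-AOD repository: the choice of GPU 1 and the helper name describe_visible_gpus are assumptions made for the example. Once the variable is set, the process renumbers the visible GPUs from zero, so physical GPU 1 is addressed as cuda:0 and a device string such as "cuda:1" would no longer refer to it.

```python
import os

# Assumption for this example: expose only physical GPU 1 to this process.
# This must run before CUDA is initialised, so it comes before any torch import.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"


def describe_visible_gpus():
    """Hypothetical helper: list the GPUs this process can see.

    Returns a short message instead if PyTorch or a CUDA device is unavailable.
    """
    try:
        import torch
    except ImportError:
        return "PyTorch not installed"
    if not torch.cuda.is_available():
        return "no CUDA device visible"
    # Visible devices are renumbered from 0, whatever their physical index is.
    return [
        f"cuda:{i} -> {torch.cuda.get_device_name(i)}"
        for i in range(torch.cuda.device_count())
    ]


print(describe_visible_gpus())
```

With this setting, scripts such as test_single.py would keep their device at "cuda:0", since that is the only ordinal visible to the process.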

Skywalker-Harrison commented 2 years ago

I don't think this solves the problem. I tried the command export CUDA_VISIBLE_DEVICES=$GPU_IDS, but whenever I set two or more GPUs for training, the error appeared. Training only runs successfully when I set a single GPU.

yuantn commented 2 years ago

Please provide your exact export command, the training command, and the complete error log. Note that the two commands should be executed one right after the other in the same terminal.
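The reason for running both commands in the same terminal is that export only affects that shell session and the processes started from it. One way to rule out a mismatch is to pass the variable directly to the child process, as in the sketch below. The helper run_with_gpus is hypothetical, and the child here is a stand-in that only echoes the variable it receives; in practice the command list would hold the real training command instead.

```python
import os
import subprocess
import sys


def run_with_gpus(command, gpu_ids):
    """Hypothetical helper: run `command` with CUDA_VISIBLE_DEVICES set for it."""
    env = os.environ.copy()
    # Join the GPU indices with plain ASCII commas, e.g. [0, 1] -> "0,1".
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


# Stand-in child process: prints the value it inherited.
probe = [
    sys.executable,
    "-c",
    "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])",
]
result = run_with_gpus(probe, [1])
print(result.stdout.strip())  # prints 1
```

Because the variable is attached to the launched process itself, the output confirms what the training script would actually see, which is useful to include alongside the error log.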