MohamedAfham / CrossPoint

Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)
https://mohamedafham.github.io/CrossPoint/

RuntimeError: CUDA error: invalid device ordinal #20

Open whuhxb opened 1 year ago

whuhxb commented 1 year ago

Hi @MohamedAfham

Have you ever met this bug before? Thanks a lot.

Using GPU : 0 from 1 devices
Use Adam
Start training epoch: (0/100)
/export/home/hanxiaobing/anaconda3/envs/crosspoint/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
Traceback (most recent call last):
  File "train_crosspoint.py", line 261, in <module>
    train(args, io)
  File "train_crosspoint.py", line 103, in train
    _, point_feats, _ = point_model(data)
  File "/export/home/hanxiaobing/anaconda3/envs/crosspoint/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/export/home/hanxiaobing/Documents/PlaneNet_PlaneRCNN/DGCNN_PointNet2/SensatUrban/MAE/CrossPoint/models/dgcnn.py", line 95, in forward
    x = get_graph_feature(x, k=self.k)
  File "/export/home/hanxiaobing/Documents/PlaneNet_PlaneRCNN/DGCNN_PointNet2/SensatUrban/MAE/CrossPoint/models/dgcnn.py", line 29, in get_graph_feature
    idx_base = torch.arange(0, batch_size, device=device).view(-1, 1, 1)*num_points
RuntimeError: CUDA error: invalid device ordinal

Dragonzz27 commented 4 days ago

I hit the same problem when trying to train CrossPoint. Have you solved it?

Dragonzz27 commented 2 days ago

I have solved the problem: in "models/dgcnn.py", line 27, change "device = torch.device('cuda:1')" to "device = torch.device('cuda:0')". The log reports only one device ("Using GPU : 0 from 1 devices"), so the hardcoded 'cuda:1' is an invalid device ordinal. Interestingly, "train_crosspoint.py" line 47 selects the device with "device = torch.device("cuda" if args.cuda else "cpu")", yet the code as written can only run on one GPU.
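
For anyone who would rather not edit the index by hand on each machine, here is a minimal sketch of checking a requested GPU index against the number of visible devices before building the device string. The helper name pick_device and its fallback policy are my own assumptions, not code from the CrossPoint repo. It is plain Python so it runs without a GPU; in practice the count would come from torch.cuda.device_count().

```python
def pick_device(requested_index: int, device_count: int) -> str:
    """Return a device string that is valid for `device_count` visible GPUs.

    Hypothetical helper for illustration only (not part of CrossPoint).
    """
    if device_count <= 0:
        # No GPU visible at all: run on the CPU.
        return "cpu"
    if 0 <= requested_index < device_count:
        # The requested ordinal exists, so use it as asked.
        return f"cuda:{requested_index}"
    # The requested ordinal is out of range (e.g. 'cuda:1' on a
    # single-GPU machine): fall back to the first GPU.
    return "cuda:0"


if __name__ == "__main__":
    # Single-GPU machine asking for index 1, as in this issue.
    print(pick_device(1, 1))  # cuda:0
```

The result can be passed to torch.device(...). Another option, again only a suggestion, is to take the device from the input tensor inside get_graph_feature (device = x.device), so the index tensor always lives on the same device as the data.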