Closed: maosuli closed this issue 1 year ago.
Same with @LuZaiJiaoXiaL here. @LuZaiJiaoXiaL Did you solve the problem by the way?
What is your PyTorch / CUDA version?
@loicland I recently tried torch 1.12.1 and 1.13.1 with cu113 and cu117. I may need to switch to a different version.
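For anyone reporting versions here, a small helper like the one below can make the report unambiguous. This is only a sketch: `parse_torch_version` is a hypothetical name, not part of PyTorch, and it assumes the usual `torch.__version__` form such as `1.13.1+cu117`.

```python
import re


def parse_torch_version(version):
    """Split a string like '1.13.1+cu117' into a release tuple and a build tag."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)[^+]*(?:\+(.+))?$", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch, build = match.groups()
    # build is None when the string carries no '+...' suffix
    return (int(major), int(minor), int(patch)), build


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        release, build = parse_torch_version(torch.__version__)
        print("torch release:", release, "build tag:", build)
        print("CUDA used to build torch:", torch.version.cuda)
        print("CUDA available at runtime:", torch.cuda.is_available())
```

Running the file as a script prints the release, the build tag, and the CUDA version the installed wheel was built against, which is the information asked for above.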
@gihunsong I have the same problem. Did you solve the problem by the way?
Hi!
We are releasing a new version of SuperPoint Graph called SuperPoint Transformer (SPT). It is better in every way:
✨ SPT in numbers ✨ |
---|
📊 SOTA results: 76.0 mIoU S3DIS 6-Fold, 63.5 mIoU on KITTI-360 Val, 79.6 mIoU on DALES |
🦋 212k parameters only! |
⚡ Trains on S3DIS in 3h on 1 GPU |
⚡ Preprocessing is x7 faster than SPG! |
🚀 Easy install (no more boost!) |
If you are interested in lightweight, high-performance 3D deep learning, you should check it out. In the meantime, we will finally retire SPG and stop maintaining this repo.
Hello. Thanks for the excellent work.
I ran into a problem when training the model on the S3DIS dataset with PyTorch 1.9.1 + CUDA 11.1.
An error occurred during the backward pass. It seems one of the variables was modified in place before the gradient was computed.
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [12806, 32, 32]], which is output 0 of AsStridedBackward0, is at version 1; expected version 0 instead.
The same code runs without this error under PyTorch 1.3 with CUDA 10.1.
But perhaps that is only because the error checking in the older version was less complete? An in-place operation may lead to an incorrect gradient.
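To illustrate the concern, here is a toy model in plain Python of a saved value with a version counter. It is not PyTorch's implementation, and the names `Value` and `SavedSquare` are made up for this sketch. It only shows why an in-place update after the forward pass gives a wrong gradient when nothing checks for it, and how a version check turns that into an error like the one above. Whether PyTorch 1.3 actually skipped such a check for this code path is not established here.

```python
class Value:
    """A mutable scalar with a version counter bumped by every in-place update."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def add_(self, amount):
        self.data += amount
        self.version += 1
        return self


class SavedSquare:
    """Records y = x**2 and remembers x (and its version) for the backward pass."""

    def __init__(self, x):
        self.x = x
        self.saved_version = x.version
        self.output = x.data ** 2

    def backward(self, check_version=True):
        if check_version and self.x.version != self.saved_version:
            raise RuntimeError(
                f"saved value is at version {self.x.version}; "
                f"expected version {self.saved_version}"
            )
        return 2 * self.x.data  # dy/dx evaluated at the current value of x


x = Value(3.0)
node = SavedSquare(x)  # forward pass at x = 3.0, true gradient is 6.0
x.add_(1.0)            # in-place update after the forward pass

# Without the check, the gradient is silently computed at x = 4.0.
silent_grad = node.backward(check_version=False)

# With the check, the stale saved value is detected instead.
try:
    node.backward(check_version=True)
    checked = "no error"
except RuntimeError as err:
    checked = str(err)
```

Here `silent_grad` comes out as 8.0 rather than the correct 6.0, while the checked call raises, so the error on the newer version may well be pointing at a real problem rather than a false alarm.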
Detailed error info is as follows.
[W python_anomaly_mode.cpp:104] Warning: Error detected in GraphConvFunctionBackward. Traceback of forward call that caused the error:
  File "/model/SuperPointGraph1/superpoint_graph-ssp-spg/learning/main.py", line 461, in <module>
    main()
  File "/model/SuperPointGraph1/superpoint_graph-ssp-spg/learning/main.py", line 331, in main
    acc, loss, oacc, avg_iou = train()
  File "/model/SuperPointGraph1/superpoint_graph-ssp-spg/learning/main.py", line 205, in train
    outputs = model.ecc(embeddings)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/model/SuperPointGraph1/superpoint_graph-ssp-spg/learning/../learning/graphnet.py", line 97, in forward
    input = module(input)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/model/SuperPointGraph1/superpoint_graph-ssp-spg/learning/../learning/modules.py", line 176, in forward
    self._edge_mem_limit)
 (function _print_stack)
  0%| | 0/78 [00:07<?, ?it/s]
Traceback (most recent call last):
  File "/model/SuperPointGraph1/superpoint_graph-ssp-spg/learning/main.py", line 461, in <module>
    main()
  File "/model/SuperPointGraph1/superpoint_graph-ssp-spg/learning/main.py", line 331, in main
    acc, loss, oacc, avg_iou = train()
  File "/model/SuperPointGraph1/superpoint_graph-ssp-spg/learning/main.py", line 209, in train
    loss.backward()
  File "/opt/conda/lib/python3.7/site-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/__init__.py", line 156, in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/function.py", line 199, in apply
    return user_fn(self, *args)
  File "/model/SuperPointGraph1/superpoint_graph-ssp-spg/learning/../learning/ecc/GraphConvModule.py", line 98, in backward
    input, weights = ctx.saved_tensors
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [12806, 32, 32]], which is output 0 of AsStridedBackward0, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later.
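The traceback ends at `ctx.saved_tensors` inside a custom backward, which is where PyTorch checks that saved tensors are unchanged. The sketch below reproduces the same class of error in isolation and shows one common workaround: applying the in-place edit to a clone. `ScaleByWeights` and `run` are hypothetical stand-ins written for this example, not code from this repo, and it does not identify which in-place operation in SPG is responsible.

```python
import torch


class ScaleByWeights(torch.autograd.Function):
    """Hypothetical custom op that saves both inputs for its backward pass."""

    @staticmethod
    def forward(ctx, x, weights):
        ctx.save_for_backward(x, weights)
        return x * weights

    @staticmethod
    def backward(ctx, grad_output):
        x, weights = ctx.saved_tensors  # the version check happens here
        return grad_output * weights, grad_output * x


def run(clone_before_inplace):
    x = torch.ones(4, requires_grad=True)
    # Multiplying by 1.0 makes `weights` a non-leaf tensor, as in the report.
    weights = torch.full((4,), 2.0, requires_grad=True) * 1.0
    out = ScaleByWeights.apply(x, weights)

    # In-place edit after the forward pass, either on the saved tensor
    # itself or on a clone that autograd has not saved.
    target = weights.clone() if clone_before_inplace else weights
    target.mul_(3.0)

    try:
        out.sum().backward()
    except RuntimeError as err:
        return str(err)
    return "ok"


if __name__ == "__main__":
    print("in-place on saved tensor:", run(clone_before_inplace=False))
    print("in-place on a clone:     ", run(clone_before_inplace=True))
```

With the clone, the saved tensor stays at version 0 and the backward pass completes. Replacing the in-place call with its out-of-place form (for example `weights * 3.0` instead of `weights.mul_(3.0)`) has the same effect, at the cost of an extra allocation.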
I would appreciate any suggestions or help.
Thanks,
Eric.