Hello authors, thank you very much for your contribution!
Traceback (most recent call last):
File "/root/MVSFormer/train_m.py", line 238, in <module>
main(0,args, config)
File "/root/MVSFormer/train_m.py", line 190, in main
trainer.train()
File "/root/MVSFormer/base/base_trainer.py", line 78, in train
result = self._train_epoch(epoch)
File "/root/MVSFormer/trainer/mvsformer_trainer.py", line 141, in _train_epoch
loss.backward()
File "/root/miniconda3/envs/mvsformer/lib/python3.10/site-packages/torch/_tensor.py", line 521, in backward
torch.autograd.backward(
File "/root/miniconda3/envs/mvsformer/lib/python3.10/site-packages/torch/autograd/__init__.py", line 289, in backward
_engine_run_backward(
File "/root/miniconda3/envs/mvsformer/lib/python3.10/site-packages/torch/autograd/graph.py", line 768, in _engine_run_backward
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/root/miniconda3/envs/mvsformer/lib/python3.10/site-packages/torch/autograd/function.py", line 306, in apply
return user_fn(self, *args)
File "/root/mmcv/mmcv/ops/carafe.py", line 180, in backward
ext_module.carafe_backward(
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
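The input consists of features at four scales. Training runs for a few iterations and then fails with the error above. The official mmcv documentation lists a similar problem:
"RuntimeError: CUDA error: invalid configuration argument"
This error may be caused by the poor performance of GPU. Try to decrease the value of THREADS_PER_BLOCK
I changed THREADS_PER_BLOCK and recompiled mmcv, but the error still occurs. My environment: GPU L20, Python 3.10, torch 2.4.0 (the version required by the original project) + cu12.1, mmcv 2.2.0. Do you have any suggestions?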
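
Since the error message notes that the stack trace may be inaccurate, one way to narrow this down might be to re-run with CUDA_LAUNCH_BLOCKING=1, as the message itself suggests. Below is a minimal sketch of that, not something taken from the project: `debug_env` and `run_training` are hypothetical helper names, and the script name `train_m.py` is assumed from the traceback, with its usual arguments passed through unchanged.

```python
import os
import subprocess
import sys


def debug_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA_LAUNCH_BLOCKING=1 is the variable named in the PyTorch error message.
    # It makes kernel launches synchronous, so the traceback should point at
    # the operation that actually failed.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_training(script="train_m.py", extra_args=()):
    """Hypothetical wrapper: launch the training script with the debug environment."""
    cmd = [sys.executable, script, *extra_args]
    # check=False so that a failing run returns its exit code instead of raising.
    return subprocess.run(cmd, env=debug_env(), check=False).returncode
```

Calling `run_training(extra_args=[...])` with the same arguments as the normal run would start training with the variable set. If the synchronous traceback still ends in `carafe_backward`, the feature shapes at the failing iteration would be the next thing worth logging.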