ManLiuCoder / PSVMA


Hello, I ran into the following problem while running your code. Could you please advise? #4

Closed · hongbo-sun closed this issue 9 months ago

hongbo-sun commented 9 months ago

```
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'")
/home/sunhongbo/open_FG/data/episode_dataset/dataset.py:12: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  self.labels = torch.tensor(labels).long()
/home/sunhongbo/open_FG/data/test_dataset.py:11: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  self.labels = torch.tensor(labels).long()
/home/sunhongbo/anaconda3/envs/openfg/lib/python3.9/site-packages/torch/optim/lr_scheduler.py:131: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of lr_scheduler.step() before optimizer.step(). "
/home/sunhongbo/anaconda3/envs/openfg/lib/python3.9/site-packages/torch/nn/functional.py:1944: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Traceback (most recent call last):
  File "/home/sunhongbo/open_FG/train.py", line 123, in <module>
    main()
  File "/home/sunhongbo/open_FG/train.py", line 118, in main
    model = train_model(cfg, args.local_rank, args.distributed)
  File "/home/sunhongbo/open_FG/train.py", line 58, in train_model
    do_train(
  File "/home/sunhongbo/open_FG/models/engine/trainer.py", line 97, in do_train
    scaled_losses.backward()
  File "/home/sunhongbo/anaconda3/envs/openfg/lib/python3.9/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/sunhongbo/anaconda3/envs/openfg/lib/python3.9/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [32, 196, 312]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
```
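For readers hitting the same RuntimeError: it means an in-place operation overwrote the output of a ReLU that autograd had saved for the backward pass. Below is a minimal, self-contained sketch that reproduces the failure and shows the out-of-place fix, plus the anomaly-detection switch the hint recommends. This is not the actual PSVMA code; only the tensor shape [32, 196, 312] is taken from the error message above.

```python
import torch

# With anomaly detection on, the eventual backward error also prints the
# traceback of the forward op whose saved tensor was modified in place.
torch.autograd.set_detect_anomaly(True)

x = torch.randn(32, 196, 312, requires_grad=True)  # shape from the error message

y = torch.relu(x)  # autograd saves this output to compute ReLU's gradient
y += 1.0           # in-place add bumps y's version counter (version 0 -> 1)
# y = y + 1.0      # out-of-place equivalent that leaves the saved tensor intact

y.sum().backward()  # RuntimeError: ... output 0 of ReluBackward0, is at version 1
```

In practice the fix is usually to hunt down an in-place op (`x += ...`, `tensor.relu_()`, `nn.ReLU(inplace=True)`, `masked_fill_`, ...) applied to an activation the backward pass still needs, and replace it with its out-of-place counterpart in the model's forward method.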
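The UserWarnings earlier in the log are unrelated to the crash, but each names its own remedy. A runnable sketch of the conventional fixes follows; the tensors and the `model`/`optimizer`/`scheduler` names are toy placeholders, not the repository's actual code:

```python
import torch

# 1) "To copy construct from a tensor": if `labels` is already a Tensor,
#    copy it the recommended way instead of re-wrapping with torch.tensor(...).
labels = torch.ones(4)
labels = labels.clone().detach().long()   # instead of torch.tensor(labels).long()

# 2) "Detected call of lr_scheduler.step() before optimizer.step()":
#    in PyTorch >= 1.1.0, step the optimizer first, then the scheduler.
model = torch.nn.Linear(2, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
model(torch.randn(1, 2)).sum().backward()
optimizer.step()
scheduler.step()

# 3) "nn.functional.sigmoid is deprecated": call torch.sigmoid instead.
out = torch.sigmoid(torch.randn(3))
```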

MeycL commented 3 months ago

I've encountered the same problem. May I ask how you resolved it?