Open OrangeSodahub opened 3 months ago
Hi, I followed your installation instructions. Here is my environment:
(hifa) $ python -V
Python 3.9.19
(hifa) $ pip show torch
Name: torch
Version: 2.0.0+cu117
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /home/.../envs/hifa/lib/python3.9/site-packages
Requires: filelock, jinja2, networkx, sympy, triton, typing-extensions
Required-by: accelerate, carvekit-colab, invisible-watermark, pytorch-lightning, taming-transformers, torch-ema, torchmetrics, torchvision, triton
(hifa) $ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
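The same details can also be gathered from inside Python, which helps confirm that the interpreter running the training script sees the same torch build. This is only a sketch: `collect_env` is a hypothetical helper and not part of HiFA. It assumes `nvcc` is on `PATH` and reports `None` for anything it cannot find.

```python
import platform
import shutil
import subprocess


def collect_env():
    """Return Python, torch, and nvcc version info (None where unavailable)."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "torch_cuda": None,
        "nvcc": None,
    }
    try:
        import torch  # optional: only reported if it can be imported

        info["torch"] = torch.__version__
        info["torch_cuda"] = torch.version.cuda  # CUDA version torch was built with
    except ImportError:
        pass

    nvcc = shutil.which("nvcc")
    if nvcc is not None:
        out = subprocess.run([nvcc, "-V"], capture_output=True, text=True).stdout
        lines = out.strip().splitlines()
        info["nvcc"] = lines[-1] if lines else None  # last line holds the build tag
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

If `torch_cuda` and the `nvcc` release disagree, that mismatch is worth mentioning in the report, although here both show 11.7.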
However, when I run `main.py`, this error occurs:
  921                 self.save_checkpoint(full=True, best=False)

/home/.../HiFA/nerf/utils.py:1192 in train_one_epoch

  1189
  1190             # loss.backward()
  1191             start = time.time()
❱ 1192             self.scaler.scale(loss).backward()
  1193
  1194             self.post_train_step()
  1195             # self.optimizer.step()

/home/.../envs/hifa/lib/python3.9/site-packages/torch/_tensor.py:487 in backward

  484                 create_graph=create_graph,
  485                 inputs=inputs,
  486             )
❱ 487         torch.autograd.backward(
  488             self, gradient, retain_graph, create_graph, inputs=inputs
  489         )
  490

/home/.../envs/hifa/lib/python3.9/site-packages/torch/autograd/__init__.py:200 in backward

  197     # The reason we repeat same the comment below is that
  198     # some Python versions print out the first line of a multi-line function
  199     # calls in the traceback and some print out the last line
❱ 200     Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the bac
  201         tensors, grad_tensors_, retain_graph, create_graph, inputs,
  202         allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to ru
  203

RuntimeError: CUDA error: invalid argument
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
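Because CUDA kernels launch asynchronously, the frame shown in a traceback like this may not be where the failure actually happened. One possible way to narrow it down is to rerun the script with the `CUDA_LAUNCH_BLOCKING=1` environment variable, which makes kernel launches synchronous. The sketch below uses a hypothetical helper, `run_with_blocking_launches`, that is not part of HiFA or PyTorch; the command passed to it is whatever was used for the failing run.

```python
import os
import subprocess
import sys


def run_with_blocking_launches(cmd):
    """Run `cmd` in a child process with CUDA_LAUNCH_BLOCKING=1 set.

    The variable must be set before CUDA is initialized, so it is passed
    through the child's environment rather than set after importing torch.
    """
    env = dict(os.environ, CUDA_LAUNCH_BLOCKING="1")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in command that only echoes the variable; replace it with the
    # original invocation, e.g. [sys.executable, "main.py", ...your arguments].
    result = run_with_blocking_launches(
        [sys.executable, "-c", "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"]
    )
    print(result.stdout, end="")
```

Training runs noticeably slower in this mode, so it is only meant for reproducing the error and getting a more trustworthy traceback.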
https://github.com/JunzheJosephZhu/HiFA/issues/9#issuecomment-2014161812
+1