Open wangjiajiTHU opened 9 months ago
Sorry for the late reply! Please try replacing the tape.py file with this file. From what I can see, the problem stems from an incompatibility in warp introduced somewhere between 0.6.1 and the version you are using (0.11.0). Let me know if it helps.
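To make this kind of mismatch visible before a long run, a small version guard can help. The following is only a sketch: the helper names `parse_version` and `warp_is_newer_than_tested` are hypothetical and not part of NCLaw or warp, and the only facts taken from this thread are the two version numbers.

```python
# Hypothetical helpers for illustration; not part of NCLaw or warp.

TESTED_WARP = "0.6.1"  # version the code was written against, per the reply above


def parse_version(text):
    """Turn '0.11.0' into (0, 11, 0) so versions compare numerically."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def warp_is_newer_than_tested(installed, tested=TESTED_WARP):
    """Return True if the installed version is newer than the tested one."""
    return parse_version(installed) > parse_version(tested)


if __name__ == "__main__":
    # Numeric comparison correctly treats 0.11.0 as newer than 0.6.1 ...
    print(warp_is_newer_than_tested("0.11.0"))  # True
    # ... whereas a plain string comparison gets it wrong.
    print("0.11.0" > "0.6.1")  # False
```

In a real script the installed string would have to come from the environment, for example via `importlib.metadata.version(...)` with the name of the installed warp distribution (assumed here, not confirmed in this thread).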
Code is not running correctly on my supercomputer:

(/sqfs/work/G15408/v60646/conda_env/nclaw) [v60646@squidhpc3 train]$ python invariant_full_meta-invariant_full_meta.py
env: blob: bsdf_pcd: type: diffuse reflectance: type: rgb value:
Warp 0.11.0 initialized:
   CUDA Toolkit: 11.5, Driver: 12.0
   Devices:
     "cpu"    | x86_64
     "cuda:0" | Quadro RTX 6000 (sm_75)
   Kernel cache: /sqfs/home/v60646/.cache/warp/0.11.0
target directory (/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/experiments/log/jelly/train/invariant_full_meta-invariant_full_meta) already exists, overwrite? [Y/r/n] y
overwriting directory (/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/experiments/log/jelly/train/invariant_full_meta-invariant_full_meta)
  0%| | 0/1000 [00:00<?, ?it/s]
/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/warp/torch.py:159: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at /opt/conda/conda-bld/pytorch_1704987280714/work/build/aten/src/ATen/core/TensorBody.h:489.)
  if t.grad is None:
  0%| | 0/300 [00:04<?, ?it/s]
Error executing job with overrides: ['overwrite=False', 'resume=False', 'gpu=0', 'cpu=0', 'env=jelly', 'env/blob/material/elasticity=invariant_full_meta', 'env/blob/material/plasticity=invariant_full_meta', 'env.blob.material.elasticity.requires_grad=True', 'env.blob.material.plasticity.requires_grad=True', 'render=debug', 'sim=low', 'name=jelly/train/invariant_full_meta-invariant_full_meta']
Traceback (most recent call last):
  File "/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/experiments/train.py", line 131, in main
    loss.backward()
  File "/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/torch/_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/torch/autograd/__init__.py", line 266, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/torch/autograd/function.py", line 289, in apply
    return user_fn(self, *args)
  File "/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/nclaw/sim/interface.py", line 62, in backward
    model.backward(statics, state_curr, state_next, tape)
  File "/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/nclaw/sim/mpm.py", line 313, in backward
    tape.backward()
  File "/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/warp/tape.py", line 119, in backward
    adj_inputs.append(self.get_adjoint(a))
  File "/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/nclaw/warp/tape.py", line 24, in get_adjoint
    adj = wp.codegen.StructInstance(a.struct)
AttributeError: 'NewStructInstance' object has no attribute 'struct'