Code is not running correctly on my supercomputer:

(/sqfs/work/G15408/v60646/conda_env/nclaw) [v60646@squidhpc3 train]$ python invariant_full_meta-invariant_full_meta.py
env:
  blob:
    bsdf_pcd:
      type: diffuse
      reflectance:
        type: rgb
        value:
        - 0.92941176
        - 0.32941176
        - 0.23137255
    material:
      elasticity:
        cls: InvariantFullMetaElasticity
        layer_widths:
        - 64
        - 64
        norm: null
        nonlinearity: gelu
        no_bias: true
        normalize_input: true
        requires_grad: true
      plasticity:
        cls: InvariantFullMetaPlasticity
        layer_widths:
        - 64
        - 64
        norm: null
        alpha: 0.001
        nonlinearity: gelu
        no_bias: true
        normalize_input: true
        requires_grad: true
      name: jelly
      ckpt: null
    shape:
      type: cube
      name: dataset
      center:
      - 0.5
      - 0.5
      - 0.5
      size:
      - 0.5
      - 0.5
      - 0.5
      resolution: 10
      mode: uniform
      sort: null
    vel:
      random: false
      lin_vel:
      - 1.0
      - -1.5
      - -2.0
      ang_vel:
      - 4.0
      - 4.0
      - 4.0
  name: jelly
  rho: 1000.0
  span:
  - 0
  - 1000
  clip_bound: 0.5
render:
  spp: 32
  width: 512
  height: 512
  skip_frame: 25
  bound: 1.75
  mpm_mul: 6
  sph_version: cuda_ad_rgb
  pcd_version: cuda_ad_rgb
  has_sphere_emitter: true
  fps: 10
sim:
  quality: low
  num_steps: 1000
  gravity:
  - 0.0
  - -9.8
  - 0.0
  bc: freeslip
  num_grids: 20
  dt: 0.0005
  bound: 3
  eps: 1.0e-07
  skip_frame: 1
train:
  teacher:
    strategy: cosine
    start_lambda: 25
    end_lambda: 200
  num_epochs: 300
  batch_size: 128
  elasticity_lr: 1.0
  plasticity_lr: 0.1
  elasticity_wd: 0.0
  plasticity_wd: 0.0
  elasticity_grad_max_norm: 0.1
  plasticity_grad_max_norm: 0.1
name: jelly/train/invariant_full_meta-invariant_full_meta
seed: 0
cpu: 0
num_cpus: 128
gpu: 0
overwrite: false
resume: false

Warp 0.11.0 initialized:
   CUDA Toolkit: 11.5, Driver: 12.0
   Devices:
     "cpu"    | x86_64
     "cuda:0" | Quadro RTX 6000 (sm_75)
   Kernel cache: /sqfs/home/v60646/.cache/warp/0.11.0
target directory (/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/experiments/log/jelly/train/invariant_full_meta-invariant_full_meta) already exists, overwrite? [Y/r/n] y
overwriting directory (/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/experiments/log/jelly/train/invariant_full_meta-invariant_full_meta)
  0%| | 0/1000 [00:00<?, ?it/s]/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/warp/torch.py:159: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at /opt/conda/conda-bld/pytorch_1704987280714/work/build/aten/src/ATen/core/TensorBody.h:489.)
  if t.grad is None:
  0%| | 0/300 [00:04<?, ?it/s]
Error executing job with overrides: ['overwrite=False', 'resume=False', 'gpu=0', 'cpu=0', 'env=jelly', 'env/blob/material/elasticity=invariant_full_meta', 'env/blob/material/plasticity=invariant_full_meta', 'env.blob.material.elasticity.requires_grad=True', 'env.blob.material.plasticity.requires_grad=True', 'render=debug', 'sim=low', 'name=jelly/train/invariant_full_meta-invariant_full_meta']
Traceback (most recent call last):
  File "/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/experiments/train.py", line 131, in main
    loss.backward()
  File "/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/torch/_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/torch/autograd/__init__.py", line 266, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/torch/autograd/function.py", line 289, in apply
    return user_fn(self, *args)
  File "/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/nclaw/sim/interface.py", line 62, in backward
    model.backward(statics, state_curr, state_next, tape)
  File "/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/nclaw/sim/mpm.py", line 313, in backward
    tape.backward()
  File "/sqfs/work/G15408/v60646/conda_env/nclaw/lib/python3.10/site-packages/warp/tape.py", line 119, in backward
    adj_inputs.append(self.get_adjoint(a))
  File "/sqfs2/cmc/1/work/G15408/v60646/github/NCLaw/nclaw/warp/tape.py", line 24, in get_adjoint
    adj = wp.codegen.StructInstance(a.struct)
AttributeError: 'NewStructInstance' object has no attribute 'struct'
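
The failure is in the repo's own `nclaw/warp/tape.py`, which reads `a.struct` on an object that Warp 0.11.0 hands back as a `NewStructInstance`, so my guess (not confirmed) is that the installed Warp version differs from the one the repo was written against. In case it helps with triage, here is a minimal sketch for reporting the installed versions. The helper name `installed_versions` is made up for illustration, and the distribution names `warp-lang` and `torch` are my assumption about how the packages were installed in this environment.

```python
from importlib import metadata


def installed_versions(dist_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed under this distribution name.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust if the environment uses different ones.
    for name, version in installed_versions(["warp-lang", "torch"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the same conda environment and comparing the output with the versions pinned by the repository (if any are listed there) should show whether a version mismatch is plausible.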

PingchuanMa / NCLaw

Code is not running correctly on my supercomputer: #1