Closed Jianrong-Lu closed 4 months ago
Ensure your GPU has enough memory and is compatible with your setup, update your CUDA and PyTorch versions, and set CUDA_LAUNCH_BLOCKING=1 in your environment so that the error is reported at the call that actually failed.
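To make the version and GPU check above concrete, here is a minimal sketch that collects the details usually worth posting alongside a CUDA error. The helper name `environment_report` is hypothetical (it is not part of Open-Sora), and the PyTorch import is guarded so the script still runs in an environment where PyTorch is missing.

```python
import importlib.util
import platform


def environment_report():
    """Collect version and GPU details that help when reporting a CUDA error.

    Hypothetical helper for illustration; not part of Open-Sora or PyTorch.
    """
    report = {
        "python": platform.python_version(),
        "torch": None,       # filled in only if PyTorch is installed
        "cuda_build": None,  # CUDA version PyTorch was built against
        "gpus": [],
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed in this environment

    import torch

    report["torch"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    if torch.cuda.is_available():
        for index in range(torch.cuda.device_count()):
            props = torch.cuda.get_device_properties(index)
            report["gpus"].append(
                {
                    "name": props.name,
                    "total_memory_gib": round(props.total_memory / 2**30, 1),
                    "capability": torch.cuda.get_device_capability(index),
                }
            )
    return report
```

Printing the returned dictionary from inside the `opensora` conda environment shows whether the PyTorch build and the visible GPUs match what you expect before you change anything else.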
This issue is stale because it has been open for 7 days with no activity.
This issue was closed because it has been inactive for 7 days since being marked as stale.
Hi, I encountered the same problem. Have you solved it?
Traceback (most recent call last):
File "/home/lu_24/Open-Sora/scripts/inference.py", line 122, in <module>
main()
File "/home/lu_24/Open-Sora/scripts/inference.py", line 103, in main
samples = scheduler.sample(
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 72, in sample
samples = self.p_sample_loop(
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 434, in p_sample_loop
for sample in self.p_sample_loop_progressive(
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 485, in p_sample_loop_progressive
out = self.p_sample(
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 388, in p_sample
out = self.p_mean_variance(
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 94, in p_mean_variance
return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 267, in p_mean_variance
model_output = model(x, t, **model_kwargs)
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 127, in __call__
return self.model(x, new_ts, **kwargs)
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 89, in forward_with_cfg
model_out = model.forward(combined, timestep, y, **kwargs)
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 267, in forward
x = auto_grad_checkpoint(block, x, y, t0, y_lens, tpe)
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/acceleration/checkpoint.py", line 24, in auto_grad_checkpoint
return module(*args, **kwargs)
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 114, in forward
x = x + self.drop_path(gate_mlp * self.mlp(t2i_modulate(self.norm2(x), shift_mlp, scale_mlp)))
File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/models/layers/blocks.py", line 52, in t2i_modulate
return x * (1 + scale) + shift
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
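Because the error message says the stack trace may point at the wrong call, the frame in `t2i_modulate` is not necessarily where the illegal access happened. Here is a minimal sketch of re-running a command with `CUDA_LAUNCH_BLOCKING=1` so that kernel launches are synchronous and the traceback lines up with the failing call. The helper names `blocking_env` and `run_blocking` are hypothetical and use only the standard library. `TORCH_USE_CUDA_DSA` is not set here, since the message describes it as a compile-time option.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base is None else base)
    # The variable must be in place before the child process initializes CUDA.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_blocking(cmd):
    """Run `cmd` (a list of arguments) with CUDA_LAUNCH_BLOCKING=1 set.

    Returns the exit code of the child process.
    """
    return subprocess.run(cmd, env=blocking_env()).returncode
```

For example, `run_blocking([sys.executable, "scripts/inference.py", ...])` with your original arguments in place of `...` reruns the failing script. Expect it to be slower, because every kernel launch now waits for completion.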