hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All
https://hpcaitech.github.io/Open-Sora/
Apache License 2.0
22k stars, 2.15k forks

RuntimeError: CUDA error: an illegal memory access was encountered #150

Closed Jianrong-Lu closed 4 months ago

Jianrong-Lu commented 7 months ago

Traceback (most recent call last):
  File "/home/lu_24/Open-Sora/scripts/inference.py", line 122, in <module>
    main()
  File "/home/lu_24/Open-Sora/scripts/inference.py", line 103, in main
    samples = scheduler.sample(
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 72, in sample
    samples = self.p_sample_loop(
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 434, in p_sample_loop
    for sample in self.p_sample_loop_progressive(
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 485, in p_sample_loop_progressive
    out = self.p_sample(
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 388, in p_sample
    out = self.p_mean_variance(
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 94, in p_mean_variance
    return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 267, in p_mean_variance
    model_output = model(x, t, **model_kwargs)
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 127, in __call__
    return self.model(x, new_ts, **kwargs)
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 89, in forward_with_cfg
    model_out = model.forward(combined, timestep, y, **kwargs)
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 267, in forward
    x = auto_grad_checkpoint(block, x, y, t0, y_lens, tpe)
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/acceleration/checkpoint.py", line 24, in auto_grad_checkpoint
    return module(*args, **kwargs)
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 114, in forward
    x = x + self.drop_path(gate_mlp * self.mlp(t2i_modulate(self.norm2(x), shift_mlp, scale_mlp)))
  File "/home/lu_24/anaconda3/envs/opensora/lib/python3.10/site-packages/opensora/models/layers/blocks.py", line 52, in t2i_modulate
    return x * (1 + scale) + shift
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

powerzbt commented 7 months ago

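Check that your GPU has enough memory and is compatible with your CUDA build, and update your CUDA and PyTorch versions. Also set CUDA_LAUNCH_BLOCKING=1 in your environment: kernels then run synchronously, so the traceback points at the operation that actually failed rather than at a later API call.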
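For anyone unsure how to apply the CUDA_LAUNCH_BLOCKING suggestion, here is a minimal sketch. It sets the variable before torch is imported (so it is in place before CUDA initializes) and prints the version details that are useful to attach to a report like this one. The helper name describe_environment is hypothetical and not part of Open-Sora; the torch calls are standard PyTorch APIs.

```python
import os

# Set this before CUDA is initialized; doing it before importing torch is the
# safest option. Kernel launches then run synchronously.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def describe_environment():
    """Collect details worth attaching to a bug report (hypothetical helper)."""
    info = {"CUDA_LAUNCH_BLOCKING": os.environ.get("CUDA_LAUNCH_BLOCKING")}
    try:
        import torch
    except ImportError:
        # torch is not installed in this interpreter; report that and stop.
        info["torch"] = None
        return info

    info["torch"] = torch.__version__
    info["torch_cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["device"] = torch.cuda.get_device_name(0)
        info["capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

The same variable can be set on the command line instead, by prefixing the usual scripts/inference.py launch command with CUDA_LAUNCH_BLOCKING=1. Expect the run to be slower while it is enabled, so it is best kept for debugging only.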

github-actions[bot] commented 6 months ago

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] commented 4 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.

JonathanLi19 commented 3 weeks ago

Hi, I encountered the same problem. Have you solved it?