apple / ml-mdm

Train high-quality text-to-image diffusion models in a data & compute efficient manner
https://machinelearning.apple.com/research/matryoshka-diffusion-models
MIT License

If I increase the step size above 50, the 256x256 model does not work. #20

Closed Oguzhanercan closed 1 hour ago

Oguzhanercan commented 20 hours ago

Hi, thanks for your work.

In my experiments, the 64x64 model works without problems. The 256x256 model, which requires more than 50 steps, behaves differently: with the step size set to 50, it runs but generates noise-like images. When I set it to 200 or 1000, I get the error below.

I have tried different torch versions, among other things. What could be the reason for this error?

../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [32,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [33,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [34,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [35,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [36,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [37,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [38,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [39,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [40,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [41,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [42,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. 
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [43,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [44,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [45,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [46,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [47,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [48,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [49,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [50,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [51,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [52,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [53,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. 
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [54,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [55,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [56,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [57,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [58,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [59,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [60,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [61,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [62,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. ../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [309,0,0], thread: [63,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed. 
0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "ml_mdm/clis/generate_sample.py", line 353, in <module>
    main(args)
  File "ml_mdm/clis/generate_sample.py", line 325, in main
    output_image, logsnr_fig, output_text, output_video, run_btn,stop_btn = generate(
  File "ml_mdm/clis/generate_sample.py", line 223, in generate
    for step, result in enumerate(
  File "/home/oguzhan/Desktop/ml-mdm/ml_mdm/samplers.py", line 525, in _sample
    x0, x_t, extra = self.get_xt_minus_1(
  File "/home/oguzhan/Desktop/ml-mdm/ml_mdm/samplers.py", line 655, in get_xt_minus_1
    p_t = self.forward_model(
  File "/home/oguzhan/Desktop/ml-mdm/ml_mdm/samplers.py", line 752, in forward_model
    p_t = model(
  File "/home/oguzhan/Desktop/ml-mdm/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/oguzhan/Desktop/ml-mdm/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/oguzhan/Desktop/ml-mdm/ml_mdm/diffusion.py", line 251, in forward
    p_t = self.vision_model(x_t, times, lm_outputs, lm_mask, micros)
  File "/home/oguzhan/Desktop/ml-mdm/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/oguzhan/Desktop/ml-mdm/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/oguzhan/Desktop/ml-mdm/ml_mdm/models/unet.py", line 986, in forward
    return self.forward_denoising(
  File "/home/oguzhan/Desktop/ml-mdm/ml_mdm/models/nested_unet.py", line 184, in forward_denoising
    x = self.forward_input_layer(
  File "/home/oguzhan/Desktop/ml-mdm/ml_mdm/models/unet.py", line 179, in wrapper
    outs = f(*args, **kwargs)
  File "/home/oguzhan/Desktop/ml-mdm/ml_mdm/models/unet.py", line 874, in forward_input_layer
    x = self.conv_in(x_t)
  File "/home/oguzhan/Desktop/ml-mdm/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/oguzhan/Desktop/ml-mdm/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/oguzhan/Desktop/ml-mdm/venv/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/oguzhan/Desktop/ml-mdm/venv/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

MultiPath commented 2 hours ago

Sorry for the late reply. I have just opened a PR that fixes the config loading issue. Could you please try again and see whether the same issue still occurs?

Oguzhanercan commented 1 hour ago

The PR fixed the problem. Thanks.
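
Note for readers who hit the same log: the repeated assertion, -sizes[i] <= index && index < sizes[i], is the bounds check applied when a tensor is indexed on the GPU. Because CUDA kernels run asynchronously, the cuDNN error at the end of the traceback is probably a downstream symptom rather than the root cause. The snippet below is only a minimal sketch of the same check in plain Python, so that suspicious indices can be inspected on the CPU. The function name and the numbers are hypothetical and are not taken from ml-mdm; the thread itself only says that the cause was a config loading issue.

```python
def out_of_bounds(indices, size):
    """Return the indices that violate the check -size <= index < size."""
    return [i for i in indices if not (-size <= i < size)]


# Hypothetical numbers: a lookup table with 50 entries that is indexed
# with values coming from a longer sampling run.
table_size = 50
requested = [0, 49, 50, 999, -50, -51]

bad = out_of_bounds(requested, table_size)
print(bad)  # [50, 999, -51]
```

Setting the environment variable CUDA_LAUNCH_BLOCKING=1 before launching the script makes kernel launches synchronous, which usually moves the Python traceback to the operation that actually failed.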