Closed morrisalp closed 1 year ago
This sounds like this PyTorch issue that was recently resolved (in the codebase). Perhaps using pipe.enable_vae_slicing() would work; otherwise we could try to apply the workaround I described there.
I see that pipe.enable_vae_slicing() avoids the issue.
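For anyone landing here later, this is a minimal sketch of how VAE slicing could be enabled for a run like the one reported. The model id `stabilityai/stable-diffusion-2`, the prompt, and the helper name `generate_batch` are assumptions for illustration; the original reproduction script is not shown in this thread.

```python
def generate_batch(prompt, batch_size=16, model_id="stabilityai/stable-diffusion-2"):
    """Generate `batch_size` images with fp16, attention slicing and VAE slicing.

    Hypothetical helper: the model id and defaults are assumptions,
    not taken from the original report.
    """
    # Imported lazily so this module can be loaded without the GPU stack installed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")

    # Same memory-saving option the report already used.
    pipe.enable_attention_slicing()

    # Decode latents one image at a time, so the VAE decoder's upsampling
    # layers never see the full batch in a single tensor.
    pipe.enable_vae_slicing()

    return pipe([prompt] * batch_size).images
```

Calling something like `generate_batch("a photo of an astronaut", batch_size=16)` would then decode each image separately in the VAE, which is presumably why the upsampling size limit is no longer hit.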
Describe the bug
Running the Stable Diffusion 2 generation pipeline with fp16, attention slicing, and batch size 16 fails with the error message: "RuntimeError: upsample_nearest_nhwc only supports output tensors with less than INT_MAX elements". EDIT: I can go up to batch size 14, but for batch sizes >= 15 I receive this error.
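The batch-size threshold is consistent with a simple element count. Here is a rough check, assuming 768x768 outputs and assuming the largest nearest-neighbor upsample in the VAE decoder produces a 256-channel tensor at full resolution. Both figures are assumptions about the model, not something stated in this report.

```python
INT_MAX = 2**31 - 1  # largest signed 32-bit integer: 2147483647

# Assumed, not stated in the report: the VAE decoder's last upsample
# outputs 256 channels at the full 768x768 image resolution.
CHANNELS = 256
HEIGHT = WIDTH = 768


def upsample_elements(batch_size):
    """Number of elements in the assumed upsample output for a given batch size."""
    return batch_size * CHANNELS * HEIGHT * WIDTH


for batch_size in (14, 15, 16):
    n = upsample_elements(batch_size)
    status = "ok" if n < INT_MAX else "exceeds INT_MAX"
    print(f"batch {batch_size}: {n:,} elements ({status})")
```

Under these assumptions, batch size 14 gives 2,113,929,216 elements (below the limit) and batch size 15 gives 2,264,924,160 (above it), which matches the observed behavior.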
Reproduction
Logs
System Info
Python 3.8.13, dockerized JupyterLab 3.5.2, diffusers 0.11.0, CUDA 11.8, NVIDIA A5000 GPU
pip freeze output: