Closed zhangmxxx closed 3 weeks ago
It turns out that, during the "Prepare image latents" phase of the forward process `StableDiffusionCanvasPipeline.__call__()`, `cpu_vae` is ignored when calling `encode_reference_image()`. The details are as follows:
```python
# line 342 in canvas.py
for region in image2image_regions:
    # cpu_vae is not passed to encode_reference_image()
    # It will use VRAM by default
    region.encode_reference_image(self.vae, device=self.device, generator=generator)
```
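A likely fix is to honor `cpu_vae` at this call site, either by forwarding the flag (if `encode_reference_image()` accepts one) or by temporarily moving the VAE to CPU around the encoding. The offload pattern can be sketched with a stand-in object; `FakeVAE` and `encode_on_cpu` are hypothetical names for illustration, not part of the library:

```python
class FakeVAE:
    """Stand-in for the real AutoencoderKL, just to show the device dance."""
    def __init__(self):
        self.device = "cuda"

    def to(self, device):
        self.device = device
        return self


def encode_on_cpu(vae, encode_fn):
    """Temporarily move the VAE to CPU for encoding, then restore its device."""
    original_device = vae.device
    vae.to("cpu")
    try:
        # The encoding runs while the VAE lives on CPU, so no VRAM is used.
        return encode_fn(vae)
    finally:
        vae.to(original_device)


vae = FakeVAE()
result = encode_on_cpu(vae, lambda v: v.device)
# `result` is "cpu" (the device seen during encoding);
# vae.device is restored to "cuda" afterwards.
```

With the real pipeline, `encode_fn` would wrap the actual `region.encode_reference_image(...)` call with `device="cpu"`.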
Problem
I was trying to reproduce eyeguided.png from your article when I hit a "CUDA out of memory" error; the log showed the process using more than 20 GiB of VRAM. The error disappeared when I removed the guide image from my generation settings.
Generation settings to reproduce the problem
Python environment
Generation code
```python
# Load and preprocess guide image
iic_image = preprocess_image(Image.open("eyeguided_sketch.png").convert("RGB"))  # The sketch was taken by a screenshot.

# Mixture of Diffusers generation
image = pipeline(
    canvas_height=2160,
    canvas_width=3840,
    regions=[
        Text2ImageRegion(0, 480, 0, 640, guidance_scale=8,
                         prompt="Abstract decorative illustration, by jackson pollock, elegant, intricate, highly detailed, smooth, sharp focus, vibrant colors, artstation, stunning masterpiece"),
        # ...... 8 * 11 grids
    ],
)["sample"][0]
```
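For reference, the elided 8 × 11 grid of region bounds can be generated programmatically instead of written out by hand. This is a minimal sketch under my assumptions about the setup (a 2160×3840 canvas tiled by 480×640 regions with uniform overlap); `make_grid_bounds` is a hypothetical helper, not part of the library:

```python
def make_grid_bounds(canvas_h, canvas_w, tile_h, tile_w, rows, cols):
    """Compute evenly spaced, overlapping (top, bottom, left, right) tile bounds.

    Hypothetical helper: strides are chosen so the first tile starts at 0
    and the last tile ends exactly at the canvas edge.
    """
    row_stride = (canvas_h - tile_h) // (rows - 1)
    col_stride = (canvas_w - tile_w) // (cols - 1)
    bounds = []
    for r in range(rows):
        for c in range(cols):
            top = r * row_stride
            left = c * col_stride
            bounds.append((top, top + tile_h, left, left + tile_w))
    return bounds


bounds = make_grid_bounds(2160, 3840, 480, 640, rows=8, cols=11)
# 88 tiles; the first is (0, 480, 0, 640) and the last ends at (2160, 3840)
```

Each tuple could then be unpacked into a `Text2ImageRegion(top, bottom, left, right, ...)` as in the snippet above.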