Open Gitterman69 opened 1 year ago
This isn't available when using 🤗Diffusers pipelines; you have to run as indicated in Run The Code Locally. Notice the args to dream() in the "I. Dream" section - you can add e.g. aspect_ratio='3:2' there.
IIUC, there's no way to specify custom resolutions per se - the resolutions are hard-coded (maybe?), but the aspect ratio can be varied.
I tried to generate a custom aspect ratio with my code below. Stage 3 gets me OOM even though I'm using a 3090. Maybe you guys can try to run it as well? It would be great to find out exactly how to get a custom aspect ratio running.
from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder
from deepfloyd_if.pipelines import dream
device = 'cuda:0'
print("Starting IF Stage I...")
if_I = IFStageI('IF-I-XL-v1.0', device=device)
print("IF Stage I completed.")
print("Starting IF Stage II...")
if_II = IFStageII('IF-II-L-v1.0', device=device)
print("IF Stage II completed.")
print("Starting Stable Stage III...")
if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device)
print("Stable Stage III completed.")
print("Initializing T5 Embedder...")
t5 = T5Embedder(device="cpu")
print("T5 Embedder initialized.")
prompt = 'ultra close-up color photo portrait of rainbow owl with deer horns in the woods'
count = 1
print("Starting dream pipeline...")
result = dream(
    t5=t5, if_I=if_I, if_II=if_II, if_III=if_III,
    prompt=[prompt]*count,
    seed=42,
    if_I_kwargs={
        "guidance_scale": 7.0,
        "sample_timestep_respacing": "smart100",
        "aspect_ratio": "3:2",
    },
    if_II_kwargs={
        "guidance_scale": 4.0,
        "sample_timestep_respacing": "smart50",
        #"aspect_ratio": "3:2",
    },
    if_III_kwargs={
        "guidance_scale": 9.0,
        "noise_level": 20,
        "sample_timestep_respacing": "75",
        #"aspect_ratio": "3:2",
    },
)
print("Dream pipeline completed.")
if_III.show(result['III'], size=14)
I haven't tried the dream pipeline yet. I have been able to provide a width argument to a stage I pipeline (using both the base DiffusionPipeline and the IFPipeline classes) successfully, but haven't succeeded for subsequent stage pipelines.
I also have a 3090 and have been testing with an identical model configuration, albeit using the DiffusionPipeline APIs. With the default resolution, peak GPU memory use during processing is over 20 GB, so it is quite possible that an aspect ratio change (particularly 3:2) would push it past the 24 GB available on the 3090. You can try 4:3, 5:4, etc. and see if that squeaks by.
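To put rough numbers on that suggestion, here is a minimal sketch that compares pixel counts across aspect ratios. It assumes the short side stays at the default size and only the long side grows, and the helper name relative_pixels is made up for illustration. Memory use does not have to scale linearly with pixel count, so treat this as a rough guide only.

```python
from fractions import Fraction


def relative_pixels(aspect_ratio: str) -> Fraction:
    """Pixel count relative to a square image with the same short side.

    Assumption: the short side keeps the default size and only the
    long side is stretched to reach the requested ratio.
    """
    w, h = (int(part) for part in aspect_ratio.split(":"))
    return Fraction(max(w, h), min(w, h))


for ratio in ("1:1", "5:4", "4:3", "3:2"):
    # e.g. 3:2 needs 1.5x the pixels of the square default
    print(ratio, float(relative_pixels(ratio)))
```

Under that assumption 3:2 needs 1.5x the pixels of the square default, 4:3 about 1.33x and 5:4 1.25x, which is why the narrower ratios are more likely to fit.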
#66 claims that it's possible to run inference with IF using only 6 GB of VRAM, but I have not tested it myself.
You might be able to perform 2-stage pipelines on 6 GB using the IF-I-M and IF-II-M models. The poster is using IF-I-XL and IF-II-L and a third scaling stage instead. They could certainly try again with smaller models.
Good news -- I was able to render a 1536x1024 image via Diffusers and the IF-I-XL -> IF-II-L -> 4x scaler pipeline configuration. This was done on an RTX 3090 -- memory usage just squeaked by at 23031MiB during the final scaling phase. I needed to make the following simple change to diffusers:
https://github.com/waffletower/diffusers/commit/035b010fa9e696ad35ccd54b2576571fefed39b8
and provide correct dimension values (width in my case) for the first two stages: width=96 and width=384 respectively.
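The arithmetic behind those two values: each stage upscales by 4x, so a stage I image of 96x64 becomes 384x256 after stage II and 1536x1024 after the 4x scaler. A small sketch of that chain follows; stage_sizes is a hypothetical helper, not part of diffusers, and the default height of 64 is inferred from the 1024-pixel final height.

```python
def stage_sizes(width_I: int, height_I: int = 64, factor: int = 4, stages: int = 3):
    """Return the (width, height) produced by each stage.

    Assumption: every stage after the first upscales both
    dimensions by the same factor (4x in this thread).
    """
    sizes = [(width_I, height_I)]
    for _ in range(stages - 1):
        w, h = sizes[-1]
        sizes.append((w * factor, h * factor))
    return sizes


print(stage_sizes(96))  # [(96, 64), (384, 256), (1536, 1024)]
```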
The pipeline configuration:
import sys
from diffusers import DiffusionPipeline, IFPipeline, IFSuperResolutionPipeline
from diffusers.utils import pt_to_pil
import torch
import numpy as np
# stage 1
stage_1 = IFPipeline.from_pretrained("DeepFloyd/IF-I-XL-v1.0", variant="fp16", torch_dtype=torch.float16)
stage_1.enable_model_cpu_offload()
# stage 2
stage_2 = IFSuperResolutionPipeline.from_pretrained("DeepFloyd/IF-II-L-v1.0", text_encoder=None, variant="fp16",
                                                    torch_dtype=torch.float16)
stage_2.enable_model_cpu_offload()
# stage 3
safety_modules = {"feature_extractor": stage_1.feature_extractor,
                  "safety_checker": stage_1.safety_checker,
                  "watermarker": stage_1.watermarker}
stage_3 = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler", **safety_modules,
                                            torch_dtype=torch.float16)
stage_3.enable_model_cpu_offload()
And the invocation:
prompt = 'Jennifer Aniston throwing her shoe at Tucker Carlson'
# text embeds
prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)
base_seed = np.random.randint(0, sys.maxsize)
for x in range(1):
    generator = torch.manual_seed(base_seed + x)
    # stage 1: base image, widened from the default 64 to 96 pixels
    image = stage_1(prompt_embeds=prompt_embeds,
                    negative_prompt_embeds=negative_embeds,
                    generator=generator,
                    output_type="pt",
                    width=96).images
    pt_to_pil(image)[0].save("./if_stage_I.png")
    # stage 2: 4x upscale; width requires the diffusers change linked above
    image = stage_2(image=image,
                    prompt_embeds=prompt_embeds,
                    negative_prompt_embeds=negative_embeds,
                    generator=generator,
                    output_type="pt",
                    width=384).images
    pt_to_pil(image)[0].save("./if_stage_II.png")
    # stage 3: 4x upscale with the Stable Diffusion upscaler
    image = stage_3(prompt=prompt,
                    image=image,
                    generator=generator,
                    noise_level=100).images
    image[0].save(f"{base_seed + x}.png")
I think an aspect ratio argument would be preferable, but one can easily be built in the calling code, where it can also coordinate the size differences between the pipeline stages.
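One way such a wrapper could look in the calling code is sketched below. The function name stage_kwargs, the 64-pixel base size, and the rounding to a multiple of 8 are all assumptions made for illustration. Passing width to stage 2 relies on the diffusers change linked above, and a height keyword for that stage is assumed as well.

```python
def stage_kwargs(aspect_ratio: str, base: int = 64, multiple: int = 8, factor: int = 4):
    """Translate an aspect ratio like '3:2' into per-stage size keywords.

    Assumptions: the short side stays at `base` pixels in stage I, the
    long side is rounded to a multiple of `multiple`, and stage II
    upscales both sides by `factor`.
    """
    w, h = (int(part) for part in aspect_ratio.split(":"))
    if w <= 0 or h <= 0:
        raise ValueError(f"invalid aspect ratio: {aspect_ratio!r}")

    def snap(value: float) -> int:
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    if w >= h:
        width_I, height_I = snap(base * w / h), base
    else:
        width_I, height_I = base, snap(base * h / w)

    return {
        "stage_1": {"width": width_I, "height": height_I},
        "stage_2": {"width": width_I * factor, "height": height_I * factor},
    }


sizes = stage_kwargs("3:2")
print(sizes["stage_1"])  # {'width': 96, 'height': 64}
print(sizes["stage_2"])  # {'width': 384, 'height': 256}
```

The dictionaries could then be splatted into the calls, e.g. stage_1(..., **sizes["stage_1"]), so that both stages always agree on the dimensions.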
Hello, I was able to replicate the change in aspect ratio for stage_1, but stage_2 complains about an unknown argument width:
IFSuperResolutionPipeline.__call__() got an unexpected keyword argument 'width'
I have deepfloyd if 1.0.1
If I don't specify width for stage_2, I get an image with the correct aspect ratio from stage 1, but stage 2 squishes everything into a 256*256 image.
Edit: My bad, I just noticed the link to the modification in src/diffusers/pipelines/deepfloyd_if/pipeline_if_superresolution.py. Will give this a try!
Edit 2: Works fine! :)
cheers!
What custom resolutions are currently supported? 1920x1024 works whereas 1920x1080 doesn't... super strange? Any ideas???
I only do the first 2 stages as the SD upscaler doesn't work on my install right now.
For the first stage, anything around 80x80 pixels and above starts to generate strange images, so that would be about 1280x1280 after 4x * 4x.
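A possible explanation for 1920x1024 working while 1920x1080 does not, unconfirmed in this thread: with two 4x upscales the final size is 16 times the stage I size, and 1080 is not divisible by 16. The sketch below just does that check; stage_I_size is a hypothetical helper, and the pipelines may impose further constraints that it does not capture.

```python
def stage_I_size(width: int, height: int, total_scale: int = 16):
    """Return the stage I size for a final resolution.

    Returns None when the final size is not a whole multiple of
    total_scale (4x * 4x = 16 for the three-stage pipeline).
    """
    if width % total_scale or height % total_scale:
        return None
    return width // total_scale, height // total_scale


print(stage_I_size(1920, 1024))  # (120, 64)
print(stage_I_size(1920, 1080))  # None, since 1080 / 16 = 67.5
print(stage_I_size(1280, 1280))  # (80, 80)
```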
I love the new model but certainly miss custom resolution and aspect ratios…. Any way to do it yet????