jerrymatjila opened this issue 2 months ago
If your GPU has enough VRAM, try commenting out this line: pipe.enable_sequential_cpu_offload()
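For reference, a minimal sketch of that change, assuming the script keeps the pipeline in a variable named pipe; the model id and prompt below are placeholders, not taken from the original script:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
# pipe.enable_sequential_cpu_offload()  # comment this out when the GPU has enough VRAM
pipe.to("cuda")  # keep the whole pipeline on the GPU instead

image = pipe("a photo of a cat", num_inference_steps=50).images[0]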
@jerrymatjila, I'm having the same issue with my 3090 card. Were you able to fix it? Thx
24 GB of VRAM should be just enough to keep the transformer model fully in VRAM, which means you can use pipe.enable_model_cpu_offload() instead of pipe.enable_sequential_cpu_offload(). Maybe you don't even need the VAE slicing/tiling (see the note after the example below). I.e.:
from pathlib import Path

import torch
from diffusers import FluxPipeline

# args is assumed to come from an argparse.ArgumentParser with model, prompt,
# height, width, num_inference_steps and output options, as in the original script.
pipe = FluxPipeline.from_pretrained(args.model, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # save some VRAM by offloading the model to CPU. Remove this if you have enough GPU power

prompt = args.prompt
image = pipe(
    prompt,
    height=args.height,
    width=args.width,
    guidance_scale=0.0,
    num_inference_steps=args.num_inference_steps,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]

Path(args.output).parent.mkdir(parents=True, exist_ok=True)
image.save(args.output)
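On the VAE slicing/tiling mentioned above: if only the final decode step runs out of memory, you can re-enable it on the pipe object from the snippet above. A minimal sketch, using the enable_slicing()/enable_tiling() methods of the AutoencoderKL that the Flux pipeline uses:

# Trade a bit of speed for lower peak VRAM during VAE decode.
pipe.vae.enable_slicing()  # decode one image of the batch at a time
pipe.vae.enable_tiling()   # decode the image in overlapping tiles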
If that still uses more VRAM than available (check the task manager), you can look into quantizing the model.
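For example, a minimal sketch of 4-bit NF4 quantization of the transformer, assuming a recent diffusers release with bitsandbytes support installed (pip install bitsandbytes); the exact version requirements are an assumption here:

import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

model_id = "black-forest-labs/FLUX.1-dev"

# Load only the large transformer in 4-bit NF4 to cut its VRAM footprint.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

# The rest of the pipeline (text encoders, VAE) stays in bfloat16.
pipe = FluxPipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()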
black-forest-labs/FLUX.1-dev runs very slowly: it takes about 15 minutes to generate a 1344x768 (w×h) image with args.num_inference_steps=50. Has anyone experienced the same, or is it just me?