LaurentMazare / diffusers-rs

An implementation of the diffusers api in Rust
Apache License 2.0

Add height and width flags #52

Closed mspronesti closed 1 year ago

mspronesti commented 1 year ago

This PR adds support for height and width flags, solving #31. I tried to keep your pipeline design unchanged: if the flags are not specified, the SD pipelines use the default values previously assigned for the chosen version of Stable Diffusion (512x512 for v1.5, 768x768 for v2.1).
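
The fallback described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `SdVersion`, `default_size`, and `resolve_size` are hypothetical names, and the only values taken from the description are the per-version defaults (512 for v1.5, 768 for v2.1).

```rust
/// Stable Diffusion versions mentioned in the PR description.
/// Hypothetical enum, introduced only for this illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SdVersion {
    V1_5,
    V2_1,
}

impl SdVersion {
    /// Default side length of the square output, as stated in the description:
    /// 512x512 for v1.5 and 768x768 for v2.1.
    pub fn default_size(self) -> i64 {
        match self {
            SdVersion::V1_5 => 512,
            SdVersion::V2_1 => 768,
        }
    }
}

/// Resolve the optional `--height` / `--width` flags to a concrete
/// `(width, height)` pair, falling back to the version default for
/// whichever flag was not passed. Hypothetical helper.
pub fn resolve_size(
    version: SdVersion,
    height: Option<i64>,
    width: Option<i64>,
) -> (i64, i64) {
    let default = version.default_size();
    (width.unwrap_or(default), height.unwrap_or(default))
}
```

With these assumptions, `resolve_size(SdVersion::V1_5, Some(496), None)` yields `(512, 496)`: the width falls back to the v1.5 default while the height comes from the flag.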

LaurentMazare commented 1 year ago

Looks pretty good thanks! Could you add some samples of the tests that you ran with this?

mspronesti commented 1 year ago

Sure. Here's how I tested it.

Not passing any arg

TORCH_CUDA_VERSION=cu116 cargo run --example stable-diffusion --features clap -- --prompt "A very rusty robot holding a fire torch." --sd-version=v1-5 --n-steps=50 --seed=1297

Final size: 512x512

Output images, side by side: diffusers-rs | HF diffusers

Only passing height

TORCH_CUDA_VERSION=cu116 cargo run --example stable-diffusion --features clap -- --prompt "A very rusty robot holding a fire torch." --sd-version=v1-5 --n-steps=50 --seed=1297 --height=496

Final size: 512x496

Output images, side by side: diffusers-rs | HF diffusers

Only passing width

TORCH_CUDA_VERSION=cu116 cargo run --example stable-diffusion --features clap -- --prompt "A very rusty robot holding a fire torch." --sd-version=v1-5 --n-steps=50 --seed=1297 --width=768

Final size: 768x512

Output images, side by side: diffusers-rs | HF diffusers

Passing both

TORCH_CUDA_VERSION=cu116 cargo run --example stable-diffusion --features clap -- --prompt "A very rusty robot holding a fire torch." --sd-version=v1-5 --n-steps=30 --seed=1297 --width=600 --height=600

Final size: 600x600

Output images, side by side: diffusers-rs | HF diffusers

LaurentMazare commented 1 year ago

Nice, thanks! (Somehow I didn't see any images for the diffusers-rs version for the last two sizes, but that might just be GitHub not liking too many large images.)

mspronesti commented 1 year ago

I just edited the last pair to use 600 instead of 768. It should be visible now, I hope. Thanks for merging :-)