rupeshs / fastsdcpu

Fast stable diffusion on CPU
MIT License

Add CLI image variations #121

Closed monstruosoft closed 5 months ago

monstruosoft commented 5 months ago

Adds a CLI image variations mode, enabled by setting the img2img argument without providing a prompt.

This commit also defines two new CLI arguments: strength, which sets the denoising strength for image variations, and batch_count, which sets the number of times image generation is run. On low-end PCs with little RAM, it may be better to generate, for example, 4 batches of 1 image rather than 1 batch of 4 images. The total number of generated images is batch_count * number_of_images.
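To make the batch arithmetic concrete, here is a minimal sketch using argparse. It is not the parser from this PR: the flag spellings, the boolean form of img2img, and the helper names `build_parser`, `total_images`, and `is_image_variations` are assumptions for illustration only. The 0.3 default for strength is the value mentioned later in this thread.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration; not the actual FastSD CPU CLI.
    parser = argparse.ArgumentParser(description="Sketch of the arguments discussed in this PR")
    parser.add_argument("--img2img", action="store_true", help="assumed here to be a boolean flag")
    parser.add_argument("--prompt", type=str, default="")
    parser.add_argument("--strength", type=float, default=0.3, help="denoising strength")
    parser.add_argument("--batch_count", type=int, default=1, help="number of generation runs")
    parser.add_argument("--number_of_images", type=int, default=1, help="images per run")
    return parser


def total_images(batch_count: int, number_of_images: int) -> int:
    # Each batch produces number_of_images images, so the totals multiply.
    return batch_count * number_of_images


def is_image_variations(args: argparse.Namespace) -> bool:
    # Image variations mode: img2img is set and no prompt was written.
    return bool(args.img2img and not args.prompt)


if __name__ == "__main__":
    args = build_parser().parse_args(["--img2img", "--batch_count", "4", "--number_of_images", "1"])
    print(total_images(args.batch_count, args.number_of_images), is_image_variations(args))
```

With these assumed flags, 4 batches of 1 image and 1 batch of 4 images both yield 4 images in total; the first form only keeps one image in flight at a time, which is the point made above about low-RAM machines.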

monstruosoft commented 5 months ago

I have now added a basic tiled upscale option to the CLI branch; it upscales the input image using the image variations mode. Please note that it's currently hardcoded to perform a 2x upscale using 512x512 output tiles. Also note that, since there's no guarantee that image variation tiles will be consistent with each other, it's recommended to use a low denoising strength; the default value of 0.3 might be a good start.
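As a rough illustration of what a 2x upscale with 512x512 output tiles implies, the sketch below computes which region of the input maps to each output tile. It is only a geometry sketch under stated assumptions: the function name `tile_boxes` is hypothetical, tiles are assumed not to overlap, and the actual code in this branch may lay out its tiles differently.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def tile_boxes(width: int, height: int, scale: int = 2, tile_size: int = 512) -> List[Tuple[Box, Box]]:
    """Return (source_box, output_box) pairs covering the upscaled image.

    Hypothetical helper: non-overlapping tiles, edge tiles clipped to the image.
    """
    out_w, out_h = width * scale, height * scale
    pairs: List[Tuple[Box, Box]] = []
    for top in range(0, out_h, tile_size):
        for left in range(0, out_w, tile_size):
            right = min(left + tile_size, out_w)
            bottom = min(top + tile_size, out_h)
            out_box = (left, top, right, bottom)
            # Map output coordinates back to the input by dividing by the scale.
            src_box = (left // scale, top // scale, right // scale, bottom // scale)
            pairs.append((src_box, out_box))
    return pairs


if __name__ == "__main__":
    for src, out in tile_boxes(512, 512):
        print(src, "->", out)
```

For a 512x512 input this gives four tiles, each generated from a 256x256 source region. Because every tile is produced by a separate image variations run, a higher denoising strength gives each run more freedom to diverge from its neighbours, which is why a low value is suggested above.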

rupeshs commented 5 months ago

@monstruosoft Nice work, I will review it later

rupeshs commented 5 months ago

@monstruosoft could you please share some inputs/outputs of tiled upscale?

monstruosoft commented 5 months ago

Sure, here are a couple of somewhat cherry-picked examples:

Original: ffe1606f-7eb9-4c40-997b-afe8efd484b3-1 Upscaled: fastSD-1705961756

Original: 08ae4812-7959-4763-b273-a5b6e14f0cbc Upscaled: fastSD-1705950397

The upscaled versions were made using SD Turbo with 1-step inference for each tile. Note that this is just a basic attempt at tiled SD upscaling with FastSD CPU, and it can be improved. For example, the black bars in the images are caused by the transparency used to make the tile seams less noticeable, but those black bars can be removed relatively easily. Also, I didn't want to mess with the actual image generation code, so currently the individual tiles are saved alongside the final images, and the output path is hardcoded as results.
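One possible way to avoid the black bars, sketched here as an assumption rather than a description of the current code, is to fade a tile's opacity only along edges that actually border another tile. The helper `feather_weights` below is hypothetical and works on a single axis; a 2D mask would be the product of the horizontal and vertical weights.

```python
from typing import List


def feather_weights(length: int, overlap: int, fade_start: bool, fade_end: bool) -> List[float]:
    """Per-pixel opacity (0..1) along one axis of a tile.

    Hypothetical helper: opacity ramps linearly over `overlap` pixels, but only
    on sides where a neighbouring tile exists to show through the transparency.
    """
    if overlap > length:
        raise ValueError("overlap must not exceed the tile length")
    weights = [1.0] * length
    for i in range(overlap):
        ramp = (i + 1) / (overlap + 1)  # strictly between 0 and 1
        if fade_start:
            weights[i] = min(weights[i], ramp)
        if fade_end:
            weights[length - 1 - i] = min(weights[length - 1 - i], ramp)
    return weights


if __name__ == "__main__":
    # Leftmost tile in a row: no neighbour on the left, so only the right edge fades.
    print(feather_weights(8, 3, fade_start=False, fade_end=True))
```

Under this scheme an outer image border keeps full opacity, so nothing transparent is left exposed there.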

rupeshs commented 5 months ago

@monstruosoft Yes, there is room for improvement. We can keep this as experimental.