rupeshs / fastsdcpu

Fast stable diffusion on CPU
MIT License

Improved tiled upscale #128

Closed monstruosoft closed 5 months ago

monstruosoft commented 5 months ago

OK, I guess the changes in this PR require some explanation. First of all, the simplest way to upscale an image is to pass the --upscale argument and specify a file with -f at the CLI, for example:

$ python src/app.py --inference_steps 1 --lcm_model_id stabilityai/sd-turbo --use_offline_model -t -f '/home/monstruosoft/fastsdcpu/results/f834fc7f-f613-42b1-aa54-42d0cefb21d3-1.png' --upscale

Now you can also achieve the same result by passing a JSON file argument, like this:

$ python src/app.py --inference_steps 1 --lcm_model_id stabilityai/sd-turbo --use_offline_model -t --upscale --custom_settings upscale.json

The JSON file has the following format:

{
  "source_file": "/home/monstruosoft/fastsdcpu/results/f834fc7f-f613-42b1-aa54-42d0cefb21d3-1.png",
  "target_file": null,
  "target_format": "jpg",
  "tile_size": 256,
  "tile_overlap": 16,
  "scale_factor": 2.0,
  "strength": 0.3,
  "tiles": []
}
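To illustrate what tile_size and tile_overlap mean, here is a minimal sketch of how a default tile grid could be laid out over the source image. This is not the code from the PR: the function names and the exact stepping scheme are assumptions made for illustration only.

```python
def tile_starts(length, tile_size, step):
    """Start offsets along one axis so that the tiles cover `length` pixels."""
    starts = [0]
    # Keep adding tiles until the last one reaches the edge of the image.
    while starts[-1] + tile_size < length:
        starts.append(starts[-1] + step)
    return starts


def tile_boxes(width, height, tile_size=256, tile_overlap=16):
    """Return (x, y, w, h) boxes covering a width x height image.

    Hypothetical helper: neighbouring tiles share `tile_overlap` pixels and
    tiles on the right/bottom edges are clipped to the image bounds.
    """
    if not 0 <= tile_overlap < tile_size:
        raise ValueError("tile_overlap must be in the range [0, tile_size)")
    step = tile_size - tile_overlap
    boxes = []
    for y in tile_starts(height, tile_size, step):
        for x in tile_starts(width, tile_size, step):
            w = min(tile_size, width - x)
            h = min(tile_size, height - y)
            boxes.append((x, y, w, h))
    return boxes


if __name__ == "__main__":
    for box in tile_boxes(512, 512):
        print(box)
```

Note that in this sketch the edge tiles end up smaller than tile_size, so the tile sizes are not uniform.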

An empty tiles array tells the code to perform the default tiled upscale. That is not always enough to obtain a good result, however; faces in particular might require some extra generation to improve the output, so you can specify custom tile settings in the JSON file, for example:

{
  "source_file": "/home/monstruosoft/fastsdcpu/results/f834fc7f-f613-42b1-aa54-42d0cefb21d3-1.png",
  "target_file": "/home/monstruosoft/fastsdcpu/results/FastSD-1706709089.jpg",
  "target_format": "jpg",
  "tile_size": 256,
  "tile_overlap": 16,
  "scale_factor": 2.0,
  "strength": 0.3,
  "tiles": [{
    "x": 288,
    "y": 62,
    "w": 125,
    "h": 109,
    "scale_factor": 4.0,
    "mask_box": null,
    "prompt": null
  }]
}

This will generate a new tile from the specified region of the source image and paste it on the target image (usually a previously upscaled version of the same source file).
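As a rough illustration of the arithmetic involved, the sketch below maps the example tile onto the target image. It rests on two assumptions that are not confirmed by the PR description: that the target image is the source scaled by the top-level scale_factor, and that a tile is generated at its own scale_factor and then resized to fit that area. The function names are hypothetical.

```python
def target_box(tile, total_scale_factor):
    """Area (x, y, w, h) the tile would cover in the target image.

    Assumption: the target image is the source image scaled by
    `total_scale_factor`, so source coordinates scale by the same amount.
    """
    return tuple(
        round(tile[key] * total_scale_factor) for key in ("x", "y", "w", "h")
    )


def generated_size(tile):
    """Size (w, h) at which the tile would be generated.

    Assumption: the tile is rendered at its own `scale_factor` before being
    resized to fit the box returned by `target_box`.
    """
    return (
        round(tile["w"] * tile["scale_factor"]),
        round(tile["h"] * tile["scale_factor"]),
    )


if __name__ == "__main__":
    # Values taken from the example JSON file above.
    tile = {"x": 288, "y": 62, "w": 125, "h": 109, "scale_factor": 4.0}
    print(generated_size(tile))   # (500, 436)
    print(target_box(tile, 2.0))  # (576, 124, 250, 218)
```

Under these assumptions, the 125x109 face region would be generated at 500x436 and then shrunk to 250x218 before being pasted at (576, 124) on the 2x target.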

What each option does might seem complicated at first and, if needed, I may post an explanation including images to demonstrate their meaning and how each option affects the output.

Note that, at the moment, the JSON file requires manual editing but the idea is that eventually the settings might get generated dynamically from within the GUI versions of FastSD CPU.
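Until that happens, the file can also be produced by a small script instead of by hand. The following is a minimal sketch using only the keys shown in the examples above; the helper names (make_tile, make_upscale_settings, write_settings) are hypothetical and not part of FastSD CPU.

```python
import json


def make_tile(x, y, w, h, scale_factor, mask_box=None, prompt=None):
    """Build one entry for the "tiles" array (keys as in the example file)."""
    return {
        "x": x,
        "y": y,
        "w": w,
        "h": h,
        "scale_factor": scale_factor,
        "mask_box": mask_box,
        "prompt": prompt,
    }


def make_upscale_settings(source_file, tiles=None, *, target_file=None,
                          target_format="jpg", tile_size=256,
                          tile_overlap=16, scale_factor=2.0, strength=0.3):
    """Build the settings dictionary; defaults mirror the example values."""
    return {
        "source_file": source_file,
        "target_file": target_file,
        "target_format": target_format,
        "tile_size": tile_size,
        "tile_overlap": tile_overlap,
        "scale_factor": scale_factor,
        "strength": strength,
        "tiles": list(tiles or []),
    }


def write_settings(settings, path):
    """Write the settings as JSON, e.g. for use with --custom_settings."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(settings, handle, indent=2)


if __name__ == "__main__":
    settings = make_upscale_settings(
        "source.png",  # placeholder path
        tiles=[make_tile(288, 62, 125, 109, 4.0)],
    )
    print(json.dumps(settings, indent=2))
```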

rupeshs commented 5 months ago

@monstruosoft Thanks

rupeshs commented 5 months ago

@monstruosoft It is not working with OpenVINO.

[image]

monstruosoft commented 5 months ago

Might be due to the arbitrary sizes used for image generation; OpenVINO must have stricter checks for output resolution. Will take a look at it.

rupeshs commented 5 months ago

@monstruosoft OpenVINO supports image sizes that are multiples of 64. I adjusted tile_overlap=32 and reshape=True; it works, but the output resolution is 1024x1152 (OpenVINO upscale is slow because of the compiling this requires).

        current_tile = context.generate_text_to_image(
            settings=config,
            device=DEVICE,
            reshape=True,
        )[0]
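
One way to respect the multiple-of-64 constraint mentioned above is to round each requested tile dimension up before generation. The helper below is a hypothetical sketch, not something taken from FastSD CPU or OpenVINO.

```python
def round_up_to_multiple(value, multiple=64):
    """Round a positive pixel size up to the nearest multiple of `multiple`."""
    if value <= 0 or multiple <= 0:
        raise ValueError("value and multiple must be positive")
    # Ceiling division using integers only, then scale back up.
    return -(-value // multiple) * multiple


if __name__ == "__main__":
    for size in (109, 125, 256, 272):
        print(size, "->", round_up_to_multiple(size))
```

Rounding up rather than down keeps the whole requested region inside the generated tile; the extra pixels would then need to be cropped or resized away.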

rupeshs commented 5 months ago

@monstruosoft I just added OpenVINO upscale support and EDSR. More details: https://github.com/rupeshs/fastsdcpu/discussions/127