pkuliyi2015 / multidiffusion-upscaler-for-automatic1111

Tiled Diffusion and VAE optimize, licensed under CC BY-NC-SA 4.0

[Feature Request] - Enhanced Integration of txt2img with Tiled Diffusion & VAE and ControlNet Tile #196

Open kotaro5487 opened 1 year ago

kotaro5487 commented 1 year ago

I propose integrating the output of txt2img with Tiled Diffusion & VAE and ControlNet through automatic patch processing. I believe this combination would improve image quality and enable larger-scale production. Please note that this text was translated by an AI (ChatGPT), so there may be some awkward phrasing or misunderstandings. I appreciate your understanding.

kotaro5487 commented 1 year ago

[screenshot: SnapCrab_NoName_2023-5-27_21-32-29_No-00]

I have implemented the desired behavior using webuiapi.

This Python script uses the API to send images generated by txt2img directly to img2img, and produces the final images using Tiled Diffusion, Noise Inversion, Tiled VAE, and ControlNet.

I've implemented this feature myself, but I would be happier if it were officially implemented.
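For orientation, the core txt2img-to-img2img round trip that the script automates looks like the minimal sketch below. It uses the same webuiapi client as the full script; the prompt is only an illustration, and all the Tiled Diffusion, Tiled VAE, and ControlNet settings that the real script adds are omitted here:

import webuiapi

api = webuiapi.WebUIApi()  # connects to http://127.0.0.1:7860 by default
r1 = api.txt2img(prompt="1girl", width=512, height=512)      # generate the base image
r2 = api.img2img(images=[r1.image], denoising_strength=0.5)  # refine / upscale it
r2.image.save("upscaled.png")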

4xGenerate.py User Manual

Purpose of Creation: I wanted to mass-produce high-quality images using Tiled Diffusion, Noise Inversion, Tiled VAE, and ControlNet. However, in txt2img, ControlNet does not function unless an image is supplied, and Noise Inversion does not work at all, so txt2img alone was not satisfactory. With img2img you must generate the source image first, and loading the images one by one is a hassle. You could generate images with txt2img first and then batch-process the folder with img2img, but that is time-consuming, and configuring Tiled Diffusion, Noise Inversion, Tiled VAE, and ControlNet every time is tedious (although you can change the default values in ui-config.json).

So I created a Python program that sends images generated by txt2img directly to img2img to create the enlarged versions. Since entering the prompt each time was too cumbersome, the script reads the prompt from an image generated in the web-ui and regenerates with a randomized seed. It uses whatever model files the web-ui currently has loaded. Starting from a 512x512 image, it should produce a 2048x2048 image in about 2 minutes.

How to Use:

Please add the --api option to the web-ui launch arguments in advance, and select your models etc. in the web-ui beforehand.

To use it, enter something like python 4xGenerate.py image1.png in the command prompt (the timing above assumes a 512x512 input and 24 GB of VRAM). If you drop the image onto the command prompt window, its path should be inserted for you. I have not implemented handling for paths containing spaces (see the sketch after these notes for one possible workaround).

Please input an integer from 1 to 100 for the batch size and the count size when asked. The script makes count-size txt2img calls of batch-size images each, so the total number of generated images is batch size x count size.

Please install any missing modules with pip (e.g. python -m pip install webuiapi).
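A possible workaround for the space-character caveat above (this is only a sketch, not something 4xGenerate.py implements) would be to re-join the command-line arguments and strip any surrounding quotes before using the path:

import sys

# Re-join all arguments and strip stray quotes so that a path containing
# spaces still arrives as a single string.
image_file = " ".join(sys.argv[1:]).strip('"')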

Here is the main program:

from PIL import Image, PngImagePlugin
import copy
import pprint
from tqdm import tqdm
import sys
import datetime
import os
import subprocess
import webuiapi

url = "http://127.0.0.1:7860"  # default web-ui address; WebUIApi() connects here by default

def get_user_input(prompt):
    while True:
        try:
            value = int(input(prompt))
            if 1 <= value <= 100:
                return value
            else:
                print("入力値が範囲外です。1から100の整数を入力してください。")  # out of range: enter an integer from 1 to 100
        except ValueError:
            print("整数を入力してください。")  # please enter an integer

def set_batch_and_count_sizes():
    batch_size = get_user_input("バッチサイズを入力してください(Please enter the batch size.) (1-100): ")
    count_size = get_user_input("カウントサイズを入力してください(Please enter the count size.) (1-100): ")
    return batch_size, count_size

def get_image_metadata(image_file):
    # Read the 'parameters' text chunk the web-ui embeds in its PNGs and
    # turn it into a txt2img payload.
    image = Image.open(image_file)
    metadata = image.info.get('parameters')
    with open('metadata.txt', 'w') as file:
        file.write(metadata)
    with open('metadata.txt', 'r') as file:
        text = file.read()
    lines = text.split("\n")
    prompt = lines[0]
    negative_prompt = lines[1].replace("Negative prompt: ", "")
    settings = lines[2].split(", ")
    payload = {'prompt': prompt, 'negative_prompt': negative_prompt}
    for setting in settings:
        key, value = setting.split(": ")
        key = key.lower()
        if value.isdigit():
            value = int(value)
        else:
            try:
                value = float(value)
            except ValueError:
                pass
        payload[key] = value
    width, height = map(int, payload.pop('size').split('x'))
    payload['width'] = width
    payload['height'] = height
    payload['seed'] = -1  # randomize the seed for every regeneration
    keys_to_keep = ["enable_hr", "denoising_strength", "firstphase_width", "firstphase_height",
                    "hr_scale", "hr_upscaler", "hr_second_pass_steps", "hr_resize_x", "hr_resize_y",
                    "prompt", "styles", "seed", "subseed", "subseed_strength",
                    "seed_resize_from_h", "seed_resize_from_w", "sampler_name", "batch_size",
                    "n_iter", "steps", "cfg_scale", "width", "height", "restore_faces",
                    "tiling", "do_not_save_samples", "do_not_save_grid", "negative_prompt",
                    "eta", "s_churn", "s_tmax", "s_tmin", "s_noise", "override_settings",
                    "override_settings_restore_afterwards", "script_args", "script_name",
                    "sampler_index"]
    payload = {k: v for k, v in payload.items() if k in keys_to_keep}
    return payload

def generate_initial_images(payload, batch_size, count_size):
    api = webuiapi.WebUIApi()
    results = []
    for _ in range(count_size):  # count_size txt2img calls of batch_size images each
        payload['batch_size'] = batch_size
        result = api.txt2img(**payload)
        results.append(result)
    images = [image for result in results for image in result.images]
    return images

def prepare_high_res_settings(payload):
    payload_copy = dict(payload)
    keys_to_keep = ["prompt", "negative_prompt", "width", "height"]
    keys_to_remove = [key for key in payload_copy.keys() if key not in keys_to_keep]
    for key in keys_to_remove:
        del payload_copy[key]
    unit1 = webuiapi.ControlNetUnit(module='tile_resample',
                                    model='control_v11f1e_sd15_tile [a371b31b]',
                                    pixel_perfect=True)
    width = payload['width']
    height = payload['height']
    Tiled_Diffusion_payload = {
        "enabled": True,
        "method": "Mixture of Diffusers",
        "overwrite_size": True,
        "keep_input_size": True,
        "image_width": width,
        "image_height": height,
        "tile_width": 96,
        "tile_height": 96,
        "overlap": 32,
        "tile_batch_size": 4,
        "upscaler_name": "4x-AnimeSharp",
        "scale_factor": 4,
        "noise_inverse": True,
        "noise_inverse_steps": 30,
        "noise_inverse_retouch": 1,
        "noise_inverse_renoise_strength": 0,
        "noise_inverse_renoise_kernel": 64,
        "control_tensor_cpu": False,
        "enable_bbox_control": False,
        "draw_background": False,
        "causal_layers": False,
        "bbox_control_states": []
    }
    Tiled_VAE_payload = {
        "enabled": True,
        "encoder_tile_size": 2800,
        "decoder_tile_size": 192,
        "vae_to_gpu": True,
        "fast_decoder": True,
        "fast_encoder": True,
        "color_fix": False
    }
    # The extensions receive positional script args, so the dict insertion
    # order above has to match the order the extensions expect.
    Tiled_Diffusion_payload_list = list(Tiled_Diffusion_payload.values())
    Tiled_VAE_payload_list = list(Tiled_VAE_payload.values())
    payload_copy.update({
        "images": [],  # filled in later, one image at a time
        "denoising_strength": 0.5,
        "seed": -1,
        "steps": 20,
        "cfg_scale": 5,
        "controlnet_units": [unit1],
        "alwayson_scripts": {
            "Tiled Diffusion": {"args": Tiled_Diffusion_payload_list},
            "Tiled VAE": {"args": Tiled_VAE_payload_list}
        }
    })
    return payload_copy, unit1, Tiled_Diffusion_payload_list, Tiled_VAE_payload_list

def upscale_and_save_images(images, payload_copy, unit1, Tiled_Diffusion_payload_list, Tiled_VAE_payload_list, image_file):
    api = webuiapi.WebUIApi()
    pprint.pprint(payload_copy, indent=4, sort_dicts=False)
    now = datetime.datetime.now()
    folder_name = now.strftime("%Y%m%d%H%M%S")
    output_folder = os.path.join(os.getcwd(), folder_name)
    os.makedirs(output_folder, exist_ok=True)
    total_images = len(images)
    for i, image in tqdm(enumerate(images), total=total_images, desc='Processing Images'):
        payload_copy["images"] = [image]
        result2 = api.img2img(**payload_copy)
        pnginfo2 = PngImagePlugin.PngInfo()
        pnginfo2.add_text("parameters", result2.info['infotexts'][0])
        now = datetime.datetime.now()
        date_string = now.strftime("%Y%m%d%H%M%S")
        new_filename = os.path.basename(image_file).replace(".png", "4x{}.png".format(date_string))
        output_path = os.path.join(output_folder, new_filename)
        result2.image.save(output_path, pnginfo=pnginfo2)
    subprocess.Popen(r'explorer /select,"{}"'.format(output_folder))  # open the output folder when done

def main():
    batch_size, count_size = set_batch_and_count_sizes()
    print("バッチサイズ(batch size):", batch_size)
    print("カウントサイズ(count size):", count_size)
    image_file = sys.argv[1]
    payload = get_image_metadata(image_file)
    images = generate_initial_images(payload, batch_size, count_size)
    payload_copy, unit1, Tiled_Diffusion_payload_list, Tiled_VAE_payload_list = prepare_high_res_settings(payload)
    upscale_and_save_images(images, payload_copy, unit1, Tiled_Diffusion_payload_list, Tiled_VAE_payload_list, image_file)

if __name__ == "__main__":
    main()
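For reference, get_image_metadata reads the standard 'parameters' text chunk that the web-ui embeds in its PNG output. An illustrative example of that chunk (the values here are made up) and how the script maps it:

# Line 1 is the prompt, line 2 the negative prompt, line 3 the settings.
example_parameters = (
    "masterpiece, 1girl, looking at viewer\n"
    "Negative prompt: lowres, bad anatomy\n"
    "Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 1234, Size: 512x512"
)
# The function lowercases the setting names, converts numeric values,
# splits "Size" into width/height, forces the seed to -1, and keeps only
# the keys that api.txt2img() accepts.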

wardensc2 commented 1 year ago

Hi koukei46124, can you explain in more detail how to run your file? I have already created a Python file and copied all your code into it. Once SD is running at 127.0.0.1:7860, how do I run your script?

Thank you in advance

kotaro5487 commented 1 year ago

There are several methods available.

One is to create a batch file and then either drag and drop the image onto the batch file, or double-click the batch file and drag and drop the image into its console window.

Another method is to open a command prompt and enter "python 4xGenerate.py input.img".

batch file

@echo off
chcp 65001

REM Name of the module to install
set MODULE_NAME=webuiapi

REM Check whether the module is already installed
python -c "import %MODULE_NAME%" 2> nul

REM Check the error level to decide whether installation is needed
if errorlevel 1 (
    echo The %MODULE_NAME% module is not installed. Starting installation.
    REM Install the module with pip
    python -m pip install %MODULE_NAME%
) else (
    echo The %MODULE_NAME% module is already installed.
)

REM Get the path of the batch file's directory
set "batch_dir=%~dp0"

REM Create the path to 4xGenerate.py
set "script_path=%batch_dir%4xGenerate.py"

REM Check if an image file was provided as argument
if "%~1"=="" (
    REM If no image file was provided, wait for user input
    set /p "image_path=Enter the path of the image file: "
) else (
    REM If an image file was provided as argument, use it
    set "image_path=%~1"
)

cd "%batch_dir%"

REM Run 4xGenerate.py
python "%script_path%" "%image_path%"

REM Pause after batch file execution
pause

kotaro5487 commented 1 year ago

I have updated the program; here is the revised version. It now asks for the Tiled VAE encoder/decoder tile sizes, maps the Hires upscale/upscaler, Denoising strength, and CFG scale entries from the image metadata, uses the tile_colorfix ControlNet module instead of tile_resample, lowers the img2img denoising strength to 0.3, and saves the output folder next to the script:

from PIL import Image, PngImagePlugin
import copy
import pprint
from tqdm import tqdm
import sys
import datetime
import os
import subprocess
import webuiapi

url = "http://127.0.0.1:7860"  # default web-ui address; WebUIApi() connects here by default

def get_user_input(prompt):
    while True:
        try:
            value = int(input(prompt))
            if 1 <= value <= 100:
                return value
            else:
                print("入力値が範囲外です。1から100の整数を入力してください。")  # out of range: enter an integer from 1 to 100
        except ValueError:
            print("整数を入力してください。")  # please enter an integer

def get_encoder_decoder_sizes():
    while True:
        # Encoder/decoder tile sizes; press Enter to keep the defaults
        # (2800 / 192, fine for 24 GB VRAM and a 512x512 input).
        encoder_size = input("エンコーダのタイルサイズを入力してください。変更しない場合はエンターを押してください。VRAM24GB且つ512×512画像ならそのまま 初期値2800: ")
        decoder_size = input("デコーダのタイルサイズを入力してください。変更しない場合はエンターを押してください。VRAM24GB且つ512×512画像ならそのまま 初期値192: ")

        if encoder_size == "":
            encoder_size = 2800  # default value
        elif not encoder_size.isdigit() or int(encoder_size) <= 0:
            print("エンコーダのタイルサイズは正の整数でなければなりません。")  # encoder tile size must be a positive integer
            continue

        if decoder_size == "":
            decoder_size = 192  # default value
        elif not decoder_size.isdigit() or int(decoder_size) <= 0:
            print("デコーダのタイルサイズは正の整数でなければなりません。")  # decoder tile size must be a positive integer
            continue

        return int(encoder_size), int(decoder_size)

def set_batch_and_count_sizes():
    batch_size = get_user_input("バッチサイズを入力してください(Please enter the batch size.) (1-100): ")
    count_size = get_user_input("カウントサイズを入力してください(Please enter the count size.) (1-100): ")
    return batch_size, count_size

def get_image_metadata(image_file):
    image = Image.open(image_file)
    metadata = image.info.get('parameters')
    with open('metadata.txt', 'w') as file:
        file.write(metadata)
    with open('metadata.txt', 'r') as file:
        text = file.read()
    lines = text.split("\n")
    prompt = lines[0]
    negative_prompt = lines[1].replace("Negative prompt: ", "")
    settings = lines[2].split(", ")
    payload = {'prompt': prompt, 'negative_prompt': negative_prompt}
    for setting in settings:
        key, value = setting.split(": ", 1)  # split on the first ": " only
        key = key.lower()
        if value.isdigit():
            value = int(value)
        else:
            try:
                value = float(value)
            except ValueError:
                pass
        payload[key] = value
    width, height = map(int, payload.pop('size').split('x'))
    payload['width'] = width
    payload['height'] = height
    payload['seed'] = -1
    # Map the infotext entries onto the API parameter names. The keys were
    # lowercased above, so the lowercase forms are checked here.
    if "hires upscale" in payload:
        payload["hr_scale"] = payload["hires upscale"]
    if "hires upscaler" in payload:
        payload["hr_upscaler"] = payload["hires upscaler"]
    if "hr_scale" in payload and "hr_upscaler" in payload:
        payload["enable_hr"] = True
    if "denoising strength" in payload:
        payload["denoising_strength"] = payload["denoising strength"]
    if "cfg scale" in payload:
        payload["cfg_scale"] = payload["cfg scale"]
    keys_to_keep = ["enable_hr", "denoising_strength", "firstphase_width", "firstphase_height",
                    "hr_scale", "hr_upscaler", "hr_second_pass_steps", "hr_resize_x", "hr_resize_y",
                    "prompt", "styles", "seed", "subseed", "subseed_strength",
                    "seed_resize_from_h", "seed_resize_from_w", "sampler_name", "batch_size",
                    "n_iter", "steps", "cfg_scale", "width", "height", "restore_faces",
                    "tiling", "do_not_save_samples", "do_not_save_grid", "negative_prompt",
                    "eta", "s_churn", "s_tmax", "s_tmin", "s_noise", "override_settings",
                    "override_settings_restore_afterwards", "script_args", "script_name",
                    "sampler_index"]
    payload = {k: v for k, v in payload.items() if k in keys_to_keep}
    return payload

def generate_initial_images(payload, batch_size, count_size):
    api = webuiapi.WebUIApi()
    results = []
    for _ in range(count_size):  # count_size txt2img calls of batch_size images each
        payload['batch_size'] = batch_size
        result = api.txt2img(**payload)
        results.append(result)
    images = [image for result in results for image in result.images]
    return images

def prepare_high_res_settings(payload, encoder_size, decoder_size):
    payload_copy = dict(payload)
    keys_to_keep = ["prompt", "negative_prompt", "width", "height"]
    keys_to_remove = [key for key in payload_copy.keys() if key not in keys_to_keep]
    for key in keys_to_remove:
        del payload_copy[key]
    unit1 = webuiapi.ControlNetUnit(module='tile_colorfix',
                                    model='control_v11f1e_sd15_tile [a371b31b]',
                                    pixel_perfect=True)
    width = payload['width']
    height = payload['height']
    Tiled_Diffusion_payload = {
        "enabled": True,
        "method": "Mixture of Diffusers",
        "overwrite_size": True,
        "keep_input_size": True,
        "image_width": width,
        "image_height": height,
        "tile_width": 96,
        "tile_height": 96,
        "overlap": 32,
        "tile_batch_size": 4,
        "upscaler_name": "4x-AnimeSharp",
        "scale_factor": 4,
        "noise_inverse": True,
        "noise_inverse_steps": 30,
        "noise_inverse_retouch": 1,
        "noise_inverse_renoise_strength": 0,
        "noise_inverse_renoise_kernel": 64,
        "control_tensor_cpu": False,
        "enable_bbox_control": False,
        "draw_background": False,
        "causal_layers": False,
        "bbox_control_states": []
    }
    Tiled_VAE_payload = {
        "enabled": True,
        "encoder_tile_size": encoder_size,
        "decoder_tile_size": decoder_size,
        "vae_to_gpu": True,
        "fast_decoder": True,
        "fast_encoder": True,
        "color_fix": False
    }
    # The extensions receive positional script args, so the dict insertion
    # order above has to match the order the extensions expect.
    Tiled_Diffusion_payload_list = list(Tiled_Diffusion_payload.values())
    Tiled_VAE_payload_list = list(Tiled_VAE_payload.values())
    payload_copy.update({
        "images": [],  # filled in later, one image at a time
        "denoising_strength": 0.3,
        "seed": -1,
        "steps": 20,
        "cfg_scale": 5,
        "controlnet_units": [unit1],
        "alwayson_scripts": {
            "Tiled Diffusion": {"args": Tiled_Diffusion_payload_list},
            "Tiled VAE": {"args": Tiled_VAE_payload_list}
        }
    })
    return payload_copy, unit1, Tiled_Diffusion_payload_list, Tiled_VAE_payload_list

def upscale_and_save_images(images, payload_copy, unit1, Tiled_Diffusion_payload_list, Tiled_VAE_payload_list, image_file, script_dir):
    api = webuiapi.WebUIApi()
    pprint.pprint(payload_copy, indent=4, sort_dicts=False)
    now = datetime.datetime.now()
    folder_name = now.strftime("%Y%m%d%H%M%S")
    output_folder = os.path.join(script_dir, folder_name)  # save next to the script
    os.makedirs(output_folder, exist_ok=True)
    total_images = len(images)
    for i, image in tqdm(enumerate(images), total=total_images, desc='Processing Images'):
        payload_copy["images"] = [image]
        result2 = api.img2img(**payload_copy)
        pnginfo2 = PngImagePlugin.PngInfo()
        pnginfo2.add_text("parameters", result2.info['infotexts'][0])
        now = datetime.datetime.now()
        date_string = now.strftime("%Y%m%d%H%M%S")
        new_filename = os.path.basename(image_file).replace(".png", "4x{}.png".format(date_string))
        output_path = os.path.join(output_folder, new_filename)
        result2.image.save(output_path, pnginfo=pnginfo2)
    subprocess.Popen(r'explorer /select,"{}"'.format(output_folder))  # open the output folder when done

def main():
    batch_size, count_size = set_batch_and_count_sizes()
    print("バッチサイズ(batch size):", batch_size)
    print("カウントサイズ(count size):", count_size)

    encoder_size, decoder_size = get_encoder_decoder_sizes()
    print("エンコーダのタイルサイズ(encoder tile size):", encoder_size)
    print("デコーダのタイルサイズ(decoder tile size):", decoder_size)

    image_file = sys.argv[1]
    script_dir = os.path.dirname(os.path.realpath(__file__))
    payload = get_image_metadata(image_file)
    images = generate_initial_images(payload, batch_size, count_size)
    payload_copy, unit1, Tiled_Diffusion_payload_list, Tiled_VAE_payload_list = prepare_high_res_settings(payload, encoder_size, decoder_size)
    upscale_and_save_images(images, payload_copy, unit1, Tiled_Diffusion_payload_list, Tiled_VAE_payload_list, image_file, script_dir)

if __name__ == "__main__":
    main()

wardensc2 commented 1 year ago

Thanks, I will try your code.

kotaro5487 commented 1 year ago

I have implemented this feature as an extension: https://github.com/kotaro5487/Hires-Tiled-Diffusion-ControlNet-Tile- I would be very happy to see this feature implemented in the main project.