RupertAvery / DiffusionToolkit

Metadata-indexer and Viewer for AI-generated images
MIT License

[Enhancement] ComfyUI - Support SDXL Workflow #134

Open Katmoget opened 1 year ago

Katmoget commented 1 year ago

Describe the bug: The folder is scanned, but ComfyUI files are not loaded in DiffusionToolkit. If I copy a webui1111 file into my ComfyUI output folder, DiffusionToolkit recognises it. If I copy a ComfyUI file into my webui1111 output folder, it is not recognised.

Version: v1.2.1

To Reproduce: Add the ComfyUI output folder to the Diffusion folder in the DiffusionToolkit settings, select watch + recursive, create an image with ComfyUI, then scan the folder for new images.

Expected behavior: New images are added automatically, or at least when the "scan folder for new images" icon is pressed.

Source Image: All new images created with ComfyUI.

Additional context: What I tried: uninstalling and reinstalling Diffusion Toolkit (deleting and rebuilding the DB); uninstalling and reinstalling ComfyUI (current version installed: 834ab27); alternately uninstalling and reinstalling .NET 6 and 7, reinstalling Diffusion Toolkit with the new version each time and deleting and rebuilding the DB (current version installed: MS .Net SDK 7.0306).

RupertAvery commented 1 year ago

Thank you.

Please upload the images here so I can test them.

If they are NSFW please upload them somewhere that does not strip the metadata, such as google drive.

azureprophet commented 1 year ago

I have the same problem. Here is a file you can use to test it. complacent 20230801235141_00001_

RupertAvery commented 1 year ago

Thanks @azureprophet.

This looks to be caused by the new SDXL workflow.

Some things have changed and as I don't use ComfyUI, I'm not sure how to process the nodes properly.

There are two nodes of class_type : "KSamplerAdvanced".

If I see KSamplerAdvanced, I'll assume it's an SDXL workflow and use a code path adapted from the old one.

They look the same except for add_noise and cfg.

I'll take the first one I see, since this is where I get some settings like steps, cfg, and where I look for the positive and negative prompt nodes.

However, the cfg values are conflicting. Which one is the one that should be indexed?

For the prompts, I'll take the text_g value. Dunno what I should do with the text_l value. I only store one prompt value since that is what regular SD does for now.

The new workflow seems to be missing the seed property. Is this now the noise_seed?
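The approach described above could be sketched roughly as follows. This is a minimal illustration in Python, not DiffusionToolkit's actual code. It assumes the embedded graph is a JSON object keyed by node id, where each node has a `class_type` and an `inputs` map, and where a link to another node is stored as a `[node_id, output_index]` pair. The names `first_sampler_settings` and `_linked_text_g` are hypothetical.

```python
from typing import Any, Dict, Optional

Graph = Dict[str, Dict[str, Any]]


def _linked_text_g(graph: Graph, link: Any) -> Optional[str]:
    """Follow a [node_id, output_index] link and return the target's text_g."""
    if not isinstance(link, list) or not link:
        return None
    source = graph.get(str(link[0]), {})
    return source.get("inputs", {}).get("text_g")


def first_sampler_settings(graph: Graph) -> Optional[Dict[str, Any]]:
    """Read steps, cfg and the text_g prompts via the first KSamplerAdvanced."""
    for node in graph.values():
        if node.get("class_type") != "KSamplerAdvanced":
            continue
        inputs = node.get("inputs", {})
        return {
            "steps": inputs.get("steps"),
            "cfg": inputs.get("cfg"),
            "prompt": _linked_text_g(graph, inputs.get("positive")),
            "negative_prompt": _linked_text_g(graph, inputs.get("negative")),
        }
    return None  # no KSamplerAdvanced node in this graph
```

"First" here means first in the order the JSON lists the nodes, which may or may not be the base-model sampler.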

Katmoget commented 1 year ago

Yes, it seems to be the SDXL workflow. I tried the default ComfyUI workflow with both SD 1.5 and SDXL and it worked; DiffusionToolkit displayed both. If I use the new workflow, the image doesn't get picked up regardless of which model I load. The default workflow loads only one model; the default SDXL workflow loads both the base and the refiner models.

ComfyUI default workflow: ComfyUI_01320_

ComfyUI default SDXL workflow: ComfyUI_01321_

Katmoget commented 1 year ago

noise_seed is the seed. One of the KSamplerAdvanced nodes is for the base model and the second is for the refiner. For the CFG, I think the first one (base model) is the most important. The second can be pretty important too, but if we can only pick one I would say the base one; same thing for the prompt. It's simple enough for us to load the file in ComfyUI to get the whole config, and given the complexity of some of the workflows I understand it might not be easy to pull everything into DiffusionToolkit, but of course it would be super handy to have the prompts, models, config, etc. in it. The default ComfyUI SDXL models are from Comfy: https://comfyanonymous.github.io/ComfyUI_examples/sdxl/
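One way to prefer the base sampler when only a single set of values can be indexed is sketched below. The heuristic is an assumption, not something confirmed in this thread: it guesses that the base pass is the KSamplerAdvanced node whose `add_noise` input is `"enable"` (the refiner pass usually continues from leftover noise), and otherwise falls back to the first sampler found. `pick_base_sampler` and `indexed_settings` are hypothetical names.

```python
from typing import Any, Dict, List, Optional

Node = Dict[str, Any]


def pick_base_sampler(graph: Dict[str, Node]) -> Optional[Node]:
    """Guess which KSamplerAdvanced node drives the base model."""
    samplers: List[Node] = [
        node for node in graph.values()
        if node.get("class_type") == "KSamplerAdvanced"
    ]
    for node in samplers:
        # Assumption: only the base pass adds fresh noise.
        if node.get("inputs", {}).get("add_noise") == "enable":
            return node
    return samplers[0] if samplers else None


def indexed_settings(node: Node) -> Dict[str, Any]:
    """Pull the values to index, treating noise_seed as the seed."""
    inputs = node.get("inputs", {})
    return {
        "seed": inputs.get("noise_seed"),
        "steps": inputs.get("steps"),
        "cfg": inputs.get("cfg"),
    }
```

Custom workflows can wire the samplers differently, so the fallback to the first node matters as much as the heuristic.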

azureprophet commented 1 year ago

The prompts are listed in the embedded JSON that it incorporates into the PNGs as strings labeled “text,” if that helps. I have been extracting them with a PNG chunk viewer.
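For anyone who wants to do the same extraction without a chunk viewer, here is a small sketch using only the Python standard library. It walks the PNG chunk list and collects `tEXt` chunks into a dictionary. It assumes the metadata is stored uncompressed in `tEXt` chunks (ComfyUI typically uses the keywords `prompt` and `workflow`), and it does not verify chunk CRCs. `read_text_chunks` is a hypothetical helper name.

```python
import struct
from typing import Dict

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(data: bytes) -> Dict[str, str]:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks: Dict[str, str] = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt payload is keyword, NUL separator, then Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 12 + length
    return chunks
```

Typical use would be `read_text_chunks(open(path, "rb").read())`, followed by `json.loads` on the value of interest.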

azureprophet commented 1 year ago

The SDXL workflow has up to 4 prompts on its most basic level (two positive, two negative). Even if you just took all of the text_* sections and concatenated them together with a spacer it would be easy to figure it out as a stopgap. Thanks for helping!
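The stopgap suggested here might look something like the sketch below. It assumes the graph layout ComfyUI embeds (nodes keyed by id, each with an `inputs` map) and joins every non-empty string input named `text` or `text_*`. Including the plain `text` key is an extra assumption, made so single-prompt encoder nodes are covered too. `concat_prompt_texts` and the `" | "` spacer are hypothetical choices.

```python
from typing import Any, Dict, List


def concat_prompt_texts(graph: Dict[str, Dict[str, Any]], spacer: str = " | ") -> str:
    """Join every text / text_* string input in the graph with a spacer."""
    parts: List[str] = []
    for node in graph.values():
        for key, value in node.get("inputs", {}).items():
            is_text_key = key == "text" or key.startswith("text_")
            # Skip links ([node_id, index] lists) and empty strings.
            if is_text_key and isinstance(value, str) and value.strip():
                parts.append(value.strip())
    return spacer.join(parts)
```

Note that this mixes positive and negative prompts into one string, so it is only useful for searching, not for reproducing the image.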

receyuki commented 1 year ago

Hi @RupertAvery, I'm the author of SD Prompt Reader. It seems like we are facing the same issues. I've been experimenting with ComfyUI recently and have some insights I'd like to share with you.

  1. Both KSamplerAdvanced and KSampler are basic nodes that can be used for any model. The approach I'm currently using to identify SDXL is to detect the presence of CLIPTextEncodeSDXL and CLIPTextEncodeSDXLRefiner. However, there are some exceptional cases, like the puppy image above, which is SDXL but using the basic CLIPTextEncode and only uses one set of prompts. In such situations, I would treat it as an SD image for processing.

  2. About steps and cfg: typically, the first set of data is the basic information, while the second set is for hires-fix or the Refiner in SDXL. However, I haven't found an effective approach to distinguishing between the two sets. As a result, I've decided to display both sets of data.

  3. The seed in KSampler and the noise_seed in KSamplerAdvanced are exactly the same thing.

  4. About SDXL prompts: if I understand correctly, there can be a maximum of six prompts for SDXL images. This includes Clip G, Clip L, and Refiner prompts for both positive and negative.

  5. Both Clip G and Clip L are equally important for SDXL users, so they should both be stored, while the Refiner is optional. There are some discussions on Reddit about the two clips that you can take a look at.

https://www.reddit.com/r/comfyui/comments/15g9pu7/what_exactly_is_text_g_and_text_l_for_sdxl/ https://www.reddit.com/r/StableDiffusion/comments/15ggn9w/sdxl_mini_study_clip_g_vs_clip_l_best_prompting/
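Points 1 to 3 above could be sketched as follows. This is an illustration only, not SD Prompt Reader's or DiffusionToolkit's code. It assumes the embedded graph maps node ids to objects with `class_type` and `inputs`, and the function names are hypothetical. SDXL is detected by the presence of the SDXL text encoder nodes, every sampler pass is kept rather than choosing one, and `seed` and `noise_seed` are treated as the same setting.

```python
from typing import Any, Dict, List

Graph = Dict[str, Dict[str, Any]]

SDXL_ENCODERS = {"CLIPTextEncodeSDXL", "CLIPTextEncodeSDXLRefiner"}
SAMPLERS = {"KSampler", "KSamplerAdvanced"}


def is_sdxl_workflow(graph: Graph) -> bool:
    """True if any SDXL-specific text encoder node is present."""
    return any(node.get("class_type") in SDXL_ENCODERS for node in graph.values())


def sampler_passes(graph: Graph) -> List[Dict[str, Any]]:
    """Collect seed, steps and cfg for every sampler node, in graph order."""
    passes: List[Dict[str, Any]] = []
    for node in graph.values():
        if node.get("class_type") not in SAMPLERS:
            continue
        inputs = node.get("inputs", {})
        passes.append({
            # KSampler calls it seed; KSamplerAdvanced calls it noise_seed.
            "seed": inputs.get("seed", inputs.get("noise_seed")),
            "steps": inputs.get("steps"),
            "cfg": inputs.get("cfg"),
        })
    return passes
```

An SDXL image built with the basic CLIPTextEncode node would return `False` from `is_sdxl_workflow` and be handled like a regular SD image, matching the exception described in point 1.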