zanllp / sd-webui-infinite-image-browsing

A fast and powerful image/video browser for Stable Diffusion webui / ComfyUI / Fooocus / NovelAI / StableSwarmUI, featuring infinite scrolling and advanced search capabilities using image parameters. It also supports standalone operation.
MIT License
1.03k stars, 126 forks

New undetected data from ComfyUI generated image #756

Open ermanitu opened 1 week ago

ermanitu commented 1 week ago

I am aware that it is very difficult to extract the data from every image generated by ComfyUI, but I have found another image whose data is not read correctly. If you find it useful, I have Python code that extracts the prompt from the workflow JSON embedded in the image. Basically, you go through the nodes until you find one of type CLIPTextEncode or CLIPTextEncodeFlux; if there are several, the one with the smallest id is usually the positive prompt.

I also attach the python code with which I extract these contents from JSON.

I love your application. Thank you.

def get_prompt(self):
    # get the prompts in the workflow
    for node in self.workflow.values():
        if node["class_type"] == "CLIPTextEncode":  # sd style
            if 'inputs' in node and 'text' in node['inputs']:
                print(node['inputs']['text'])
        elif node["class_type"] == "CLIPTextEncodeFlux":  # flux style
            if 'inputs' in node and 't5xxl' in node['inputs']:
                print(node['inputs']['t5xxl'])

[Attached image: PNG-C27F9B89F6FB-1]
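The smallest-id heuristic described above could be sketched roughly as follows. The function name `pick_prompts` and the `TEXT_NODE_KEYS` table are hypothetical, and treating the second-smallest id as the negative prompt is an extra assumption that the comment above does not state. Node ids are assumed to be numeric strings; anything else is skipped.

```python
from typing import Optional, Tuple

# Node types and the input key holding their text, as named in the comment above.
TEXT_NODE_KEYS = {"CLIPTextEncode": "text", "CLIPTextEncodeFlux": "t5xxl"}


def pick_prompts(workflow: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (positive, negative) using the smallest-id heuristic (hypothetical helper)."""
    found = []
    for node_id, node in workflow.items():
        if not isinstance(node, dict) or not str(node_id).isdigit():
            continue  # assumption: ids are numeric strings such as "6"
        key = TEXT_NODE_KEYS.get(node.get("class_type"))
        text = node.get("inputs", {}).get(key) if key else None
        # Linked inputs are lists like ["4", 1]; only literal strings are prompts.
        if isinstance(text, str):
            found.append((int(node_id), text))
    found.sort()  # compare ids numerically, not as strings
    positive = found[0][1] if found else None
    # Assumption: the next text node is the negative prompt.
    negative = found[1][1] if len(found) > 1 else None
    return positive, negative
```

Sorting numerically matters because the ids are JSON keys: as strings, "10" would sort before "6".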

zanllp commented 6 days ago

Update to the latest IIB

ermanitu commented 6 days ago

Sorry, the problem persists.

zanllp commented 5 days ago

How did you save this image from ComfyUI? The information in this image is missing a field with the key 'workflow', so it didn't go through the IIB ComfyUI parser, which caused this exception.

ermanitu commented 5 days ago

No, it's my own program using the ComfyUI API. I think the image returned by the API is missing some data. ComfyUI opens those images without problems. I will investigate. Thanks.

zanllp commented 5 days ago
image
ermanitu commented 5 days ago

Could you send me the 'prompt' and 'workflow' values? I am guessing that the problem may be the order of the nodes. I will generate another image with the nodes ordered and test it in IIB.

ermanitu commented 5 days ago

I sorted the JSON keys but the problem persists. I think you should use the value of 'prompt', not the value of 'workflow'. You can simply ignore the 'workflow' value if it does not exist. I'm looking into why the ComfyUI API does not provide it.

zanllp commented 5 days ago

'workflow' is only used to determine whether the image was generated by ComfyUI; the parsing itself still uses 'prompt'.

https://github.com/zanllp/sd-webui-infinite-image-browsing/blob/1a0f07125845609dfe781492d68b34b81119f47f/scripts/iib/tool.py#L396-L403

ermanitu commented 5 days ago

OK, I will then try to embed a workflow in the image to resolve the problem. Thanks!

ermanitu commented 5 days ago

Oops! Not possible. The ComfyUI API requires workflows saved in a different format (saved by "save API format") from the one used by ComfyUI itself. The only option is to detect ComfyUI generation without using the 'workflow' attribute. Is it possible to infer ComfyUI from some property of the 'prompt' variable?

Here is an example of the 'prompt' structure for an SDXL workflow in the API format.

sdxl = '''
{
  "3": {
    "inputs": {
      "seed": 997541242705486,
      "steps": 20,
      "cfg": 8,
      "sampler_name": "euler",
      "scheduler": "normal",
      "denoise": 1,
      "model": ["4", 0],
      "positive": ["6", 0],
      "negative": ["7", 0],
      "latent_image": ["5", 0]
    },
    "class_type": "KSampler",
    "_meta": {"title": "KSampler"}
  },
  "4": {
    "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"},
    "class_type": "CheckpointLoaderSimple",
    "_meta": {"title": "Load Checkpoint"}
  },
  "5": {
    "inputs": {"width": 1024, "height": 1024, "batch_size": 1},
    "class_type": "EmptyLatentImage",
    "_meta": {"title": "Empty Latent Image"}
  },
  "6": {
    "inputs": {
      "text": "beautiful scenery nature glass bottle landscape, purple galaxy bottle,",
      "clip": ["4", 1]
    },
    "class_type": "CLIPTextEncode",
    "_meta": {"title": "CLIP Text Encode (Prompt)"}
  },
  "7": {
    "inputs": {"text": "text, watermark", "clip": ["4", 1]},
    "class_type": "CLIPTextEncode",
    "_meta": {"title": "CLIP Text Encode (Prompt)"}
  },
  "8": {
    "inputs": {"samples": ["3", 0], "vae": ["4", 2]},
    "class_type": "VAEDecode",
    "_meta": {"title": "VAE Decode"}
  },
  "10": {
    "inputs": {"images": ["8", 0]},
    "class_type": "PreviewImage",
    "_meta": {"title": "Preview Image"}
  }
}
'''
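In an API-format structure like the one above, the KSampler node names its prompt nodes explicitly through the "positive" and "negative" links, so the prompts could also be read by following those links rather than by guessing from node ids. The sketch below is only an illustration: `prompts_from_sampler` is a hypothetical helper, it only looks at the first node of type KSampler, and it assumes each link points directly at a node with a literal "text" input.

```python
import json
from typing import Dict, Optional


def prompts_from_sampler(prompt_json: str) -> Dict[str, Optional[str]]:
    """Follow the sampler's positive/negative links to their text (hypothetical helper)."""
    graph = json.loads(prompt_json)
    result: Dict[str, Optional[str]] = {"positive": None, "negative": None}
    for node in graph.values():
        if not isinstance(node, dict) or node.get("class_type") != "KSampler":
            continue
        for role in ("positive", "negative"):
            link = node.get("inputs", {}).get(role)
            # A link is [source_node_id, output_index], e.g. ["6", 0].
            if isinstance(link, list) and link:
                source = graph.get(str(link[0]), {})
                text = source.get("inputs", {}).get("text")
                if isinstance(text, str):
                    result[role] = text
        break  # assumption: the first sampler found is the relevant one
    return result
```

Graphs that route the conditioning through intermediate nodes, or that use a different sampler node type, would need more traversal than this.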

zanllp commented 4 days ago

I'll check how to implement it later. Of course, if you have any ideas, feel free to propose a PR!

ermanitu commented 4 days ago

Well, I tried commenting out the 'workflow' part and IIB runs perfectly. I don't know whether this has any other implications, but for me it is perfect.

def is_img_created_by_comfyui(img: Image):
    return img.info.get('prompt')  # and img.info.get('workflow')
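Since checking only for the presence of 'prompt' could also match images from other tools that happen to store a 'prompt' field, a somewhat stricter check might validate the shape of the value instead. This is a sketch under assumptions, not the IIB implementation: `looks_like_comfyui_prompt` is a hypothetical name, it takes the image's info mapping (for example `img.info`) rather than the image, and it assumes every node in an API-format graph is an object with "class_type" and "inputs" keys, as in the example earlier in this thread.

```python
import json
from typing import Mapping


def looks_like_comfyui_prompt(info: Mapping[str, object]) -> bool:
    """Guess whether info['prompt'] holds a ComfyUI API-format graph (hypothetical helper)."""
    raw = info.get("prompt")
    if not isinstance(raw, str):
        return False
    try:
        graph = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    if not isinstance(graph, dict) or not graph:
        return False
    # Every entry should look like a node: {"class_type": ..., "inputs": {...}}.
    return all(
        isinstance(node, dict) and "class_type" in node and "inputs" in node
        for node in graph.values()
    )
```

A plain-text prompt fails the JSON parse and is rejected, while an image with no 'workflow' field but a valid graph in 'prompt' is still accepted.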

Thank you, and congratulations on your software.