vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Feature]: Let's brainstorm together and find a way to export information about input images (a MUCH NEEDED feature for img2img fanatics) #2074

Closed StableInfo closed 11 months ago

StableInfo commented 11 months ago

Feature description

Hello, I have been longing for a feature that saves info about images used as inputs for img2img (and also ControlNet, by the way); see https://github.com/vladmandic/automatic/issues/2067 and https://github.com/vladmandic/automatic/issues/1442

I think I might have a solution, but since I don't know the code very well yet I can't implement it on my own; I will need help. Here is the idea. We will need two functions:

1. Function 1: whenever the img2img process is invoked, and before the image is converted to base64, save info about the input image in a text file placed inside "outputs/today's date". That folder will also contain the input image itself and will be named after the image.
2. Function 2: after the img2img process completes, find the folder containing that text file, locate the file, and rename it with the seed_prompt of the actual output image.

If a folder named after the input image already exists, Function 1 will not re-create it; it will simply add a new text file and, obviously, will not create another copy of the input image. Function 2 will use a "global variable", set beforehand (while Function 1 is running), to identify which text file is the right one and avoid renaming the wrong one.
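Here is a rough, untested sketch of what I mean (the paths, function names, and the extra_info argument are just placeholders I made up, not anything SD.Next actually exposes):

```python
# Rough sketch only - helper names and paths are placeholders, not real SD.Next APIs.
import os
import shutil
from datetime import date

_last_info_file = None  # the "global variable" Function 2 uses to find the right file

def save_input_info(input_image_path, extra_info=""):
    """Function 1: before img2img runs, copy the input image and an info file
    into outputs/<today's date>/<input image name>/."""
    global _last_info_file
    name = os.path.splitext(os.path.basename(input_image_path))[0]
    folder = os.path.join("outputs", date.today().isoformat(), name)
    os.makedirs(folder, exist_ok=True)             # do not re-create an existing folder
    copy_path = os.path.join(folder, os.path.basename(input_image_path))
    if not os.path.exists(copy_path):
        shutil.copy2(input_image_path, copy_path)  # keep only one copy of the input image
    _last_info_file = os.path.join(folder, "pending.txt")
    with open(_last_info_file, "w", encoding="utf-8") as f:
        f.write(extra_info)

def rename_input_info(seed_prompt):
    """Function 2: after img2img finishes, rename the pending info file to the
    seed_prompt of the output image (e.g. "00371-amazing breathtaking...")."""
    global _last_info_file
    if _last_info_file and os.path.exists(_last_info_file):
        target = os.path.join(os.path.dirname(_last_info_file), seed_prompt + ".txt")
        os.rename(_last_info_file, target)
        _last_info_file = None
```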

This way the whole img2img process is not altered. HOWEVER, the user can go check the created folders to see if he can find a text file whose name (seed_prompt) matches one of his outputs; this way he will be able to reproduce his img2img workflows.

The final result could look like this: [screenshot attached]

Just the fact that the folder is named after the input image is already a big step. We might not even need to write info into the text files created inside this folder, except for naming them (Function 2), since we already have a copy of the input image inside the folder (and it should contain everything we need).

Whenever the user needs to know which input image was used during a workflow, he can filter the list of folders inside the "img2img extra info" parent folder. Then he looks for the name of the output he is studying (seed_prompt = "00371-amazing breathtaking red orange blue yellow green lower", for example); if he finds it, he knows that this output was created using the image copy stored in the same folder, or he can simply check the name of the folder (named after the input image).

What do you think of my idea?

Another, simpler method that came to mind while writing this is to create a single txt file containing lines like this:

Seed_Prompt (output file) - name of the input image
Seed_Prompt (output file) - name of the input image
Seed_Prompt (output file) - name of the input image
...

The user would simply check this file and search for the output file name (seed+prompt) to find which input image was used. The function would have to append to (or rewrite) this big txt file during or after each img2img run. Something like that.
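A minimal, untested sketch of that append-only log (the log path, function name, and arguments are just placeholders I made up):

```python
import os

def log_img2img_pair(output_name, input_image_name, log_path="outputs/img2img-inputs.txt"):
    """Append one "output - input" line so an output can be traced back to its input image."""
    if os.path.dirname(log_path):
        os.makedirs(os.path.dirname(log_path), exist_ok=True)
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"{output_name} - {input_image_name}\n")

# example:
# log_img2img_pair("00371-amazing breathtaking red orange blue yellow green lower", "flower_sketch.png")
```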

Can anyone make this? I really need to keep info about the input images I am using so that I can reproduce my results. If this works well, we can then extend it to outpainting, ControlNet, etc.

By the way, the classic text file created inside the usual outputs folder stays untouched and is still created if the option is checked; nothing changes there.


So I was thinking that function 1 could be inserted inside:

Version Platform Description

No response

Aptronymist commented 11 months ago

We do already have two methods of capturing the full generation info for each image into external files: [settings screenshot]

the json log looks like this:

{ "filename": "outputs\grids\grid-0033-stunning young woman in a dark alley at", "time": "2023-08-25T21:16:37.670458", "info": "stunning young woman in a dark alley at night, wearing only a long black leather trench coat with nothing else under it, she opens the coat to show her slim petite nude body, (extremely detailed naked woman), as she approaches she looks at the viewer with a sultry and seductive attraction, tilt-shift, solo focus, nsfw\nNegative prompt: (low quality, worst quality, blurry, fuzzy, hazy, out of focus, poorly drawn), (grainy, indistinct, unclear, busty, plump), sfw, \nSteps: 30, Seed: 2716219983, Sampler: DPM++ 3M SDE, CFG scale: 8, Size: 600x800, Batch: 1x4, Parser: Full parser, Model: epicphotogasm_v1, Model hash: e48ca7f826, Clip skip: 2, Image CFG Scale: 6, Backend: Original, Version: 2e13c98, Operations: txt2img, Hashes: {\"model\": \"e48ca7f826\"}" }

I sent that to img2img, did a latent upscale with a different sampler for about 20 steps, and it output this to the log:

{ "filename": "outputs\image\00000-stunning young woman in a dark alley at", "time": "2023-08-25T21:26:17.525296", "info": "stunning young woman in a dark alley at night, wearing only a long black leather trench coat with nothing else under it, she opens the coat to show her slim petite figure, completely nude, (extremely detailed naked woman), as she approaches she looks at the viewer with a sultry and seductive gaze, tilt-shift, solo focus, nsfw\nNegative prompt: (low quality, worst quality, blurry, fuzzy, hazy, out of focus, poorly drawn), (grainy, indistinct, unclear, busty, plump), (sfw, pants, shorts, panties, censorship, censored)\nSteps: 25, Seed: 270668499, Sampler: DPM++ 2M SDE Karras, CFG scale: 6, Size: 900x1200, Parser: Full parser, Model: epicphotogasm_v1, Model hash: e48ca7f826, Init image hash: 342b7ff890dea5bba0656e06da95df98, Image CFG Scale: 1.5, Backend: Original, Version: 2e13c98, Operations: img2img, Resize mode: 4, Hashes: {\"model\": \"e48ca7f826\"}" }

and leaves this in the outputs/image folder:

[screenshot]

So it seems to me that we're already doing quite a lot, and you may want to look through your Settings options.

vladmandic commented 11 months ago

and one thing i've tried to highlight several times already - there is no concept of an input image name. as soon as you upload or drop an image into the browser, it's just pixel data. it's not converted by the app, and the server has no image-name-before-convert - the image name simply does not exist on the server. and even if it did, an image name means nothing to anyone except on the original system.

the concept of extra files and/or folders is just no.

you already have the input image hash in the metadata. nothing is stopping you from creating some kind of image browser that keeps name and hash so you can search for an image by that hash. as far as i'm concerned, the input image hash is the only relevant information, since it's not tied to the user's files/folders. so the question is how to find an image based on its hash - that can be a full-fledged image viewer app or a small tool or whatever, but it's not up to sdnext to do that.
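for illustration only, a tiny standalone example of such a name/hash index (completely separate from sdnext; compute_hash is a stand-in for whatever formula is actually used - the exact one is given further down in this thread):

```python
# toy "image browser" index - not part of sdnext
import json
import os

def build_index(image_folder, compute_hash, index_path="hash_index.json"):
    """Store a hash -> filename mapping so an image can be found later from the
    init image hash recorded in the output metadata."""
    index = {}
    for name in os.listdir(image_folder):
        if name.lower().endswith((".png", ".jpg", ".jpeg", ".webp")):
            index[compute_hash(os.path.join(image_folder, name))] = name
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(index, f, indent=2)
    return index

def find_by_hash(image_hash, index_path="hash_index.json"):
    """Return the original filename for a given hash, or None if unknown."""
    with open(index_path, encoding="utf-8") as f:
        return json.load(f).get(image_hash)
```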

StableInfo commented 11 months ago

I am not sure you are getting what I am asking for, Aptronymist. Does your text file contain info about the "source" image = input image = base image used in the img2img process? I am trying to save info about the images located here: [screenshot]

StableInfo commented 11 months ago

> you already have the input image hash in the metadata. nothing is stopping you from creating some kind of image browser that keeps name and hash so you can search for an image by that hash. as far as i'm concerned, the input image hash is the only relevant information, since it's not tied to the user's files/folders. so the question is how to find an image based on its hash - that can be a full-fledged image viewer app or a small tool or whatever, but it's not up to sdnext to do that.

So isn't it possible to keep the "image hash" inside some text file? Then we can brainstorm later about how to find the image using that hash.

vladmandic commented 11 months ago

But... You already have that.

StableInfo commented 11 months ago

Ok I just tried an experiment with this script:

```python
# Import the libraries
from PIL import Image
import imagehash

# Load the target image and compute its average hash
target = Image.open("target.jpg")
target_hash = imagehash.average_hash(target)
print("Target hash:", target_hash)

# Load the database of images and compare their hashes with the target
database = ["image1.jpg", "image2.jpg", "image3.jpg"]
for image in database:
    img = Image.open(image)
    img_hash = imagehash.average_hash(img)
    print("Image hash:", img_hash)
    # If the hashes are equal or have a small hamming distance, they are similar
    if target_hash == img_hash or (target_hash - img_hash) < 5:
        print("Found a match:", image)
        break
else:
    print("No match found")
```

And it worked.


How can I "collect" the image hash of the input image?

Are you saying that the text files generated next to every output contain the "image hash" of the input image used during the img2img process?

Here is the text file I generated yesterday when using img2img:

Indie game art,(amazing breathtaking (((magenta))) , (((blue gray flower))) on a spectral background ), (Vector Art, Borderlands style, Arcane style, Cartoon style), Line art, Disctinct features, Hand drawn, Technical illustration, Graphic design, Vector graphics, High contrast, Precision artwork, Linear compositions, Scalable artwork, Digital art, cinematic sensual, Sharp focus, humorous illustration, big depth of field, Masterpiece, trending on artstation, Vivid colors, trending on ArtStation, trending on CGSociety, Intricate, Low Detail, dramatic, detailed skin texture, (blush:0.2), (goosebumps:0.3), subsurface scattering, (Studio ghibli style, Art by Hayao Miyazaki:1.2), Anime Style, Manga Style, Hand drawn, cinematic sensual, Sharp focus, humorous illustration, big depth of field, Masterpiece, concept art, trending on artstation, Vivid colors, Simplified style, trending on ArtStation, trending on CGSociety, Intricate, Vibrant colors, Soft Shading, Simplistic Features, Sharp Angles, Playful Negative prompt: NSFW, Cleavage, Pubic Hair, Nudity, Naked, Au naturel, Watermark, Text, censored, deformed, bad anatomy, disfigured, poorly drawn face, mutated, extra limb, ugly, poorly drawn hands, missing limb, floating limbs, disconnected limbs, disconnected head, malformed hands, long neck, mutated hands and fingers, bad hands, missing fingers, cropped, worst quality, low quality, mutation, poorly drawn, huge calf, bad hands, fused hand, missing hand, disappearing arms, disappearing thigh, disappearing calf, disappearing legs, missing fingers, fused fingers, abnormal eye proportion, Abnormal hands, abnormal legs, abnormal feet, abnormal fingers Steps: 20, Seed: 2861653623, Sampler: Euler a, CFG scale: 12, Size: 691x1121, Batch: 12x1, Parser: Full parser, Model: dreamshaper_331BakedVae, Model hash: 1dceefec07, VAE: vae-ft-mse-840000-ema-pruned, Image CFG Scale: 1.5, Backend: Original, Version: 6a4d4ea, Operations: img2img, Resize mode: 0

I don't see any image hash here.

vladmandic commented 11 months ago

If it's not included, then it's an issue, it should be. I'll check.

StableInfo commented 11 months ago

Is this a recently added feature? My build is 4 days old:

app: SD.next
updated: 2023-08-22
hash: 6a4d4ea5

And I have never, ever in my life seen the image hash of the input image inside the text files.

StableInfo commented 11 months ago

> If it's not included, then it's an issue, it should be. I'll check.

When would this be fixed, by your estimate? I am making so many img2img images right now; I am certain I will have a hard time reproducing my workflows.

Aptronymist commented 11 months ago

> Is this a recently added feature? My build is 4 days old:
>
> app: SD.next
> updated: 2023-08-22
> hash: 6a4d4ea5
>
> And I have never, ever in my life seen the image hash of the input image inside the text files.

Well then I don't have any idea how mine above has that info, which it does if you look. As does another test I just did with a random image I dragged and dropped in. The JSON image log is also keeping track of init image hashes, and they show up in the text file for each new image in outputs/image. AND, if you have it turned on, the init-images folder will get a copy of the image you used for the img2img process, with the file named after the hash of the image itself. [screenshots attached]


Aptronymist commented 11 months ago

So, do you have these things turned on?

[screenshots of the three relevant Settings options]

vladmandic commented 11 months ago

correct - the image hash is only saved when "save copy of processing init images" is enabled, and then it's actually used as the filename when saving that init image.

also, if you want to run manual calculations, this is the actual hash formula (you used the 3rd-party imagehash average_hash, which would generate a completely different hash):

```python
hashlib.sha256(img.tobytes()).hexdigest()[0:8]
```
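for example, a quick standalone check (the filename here is just an example):

```python
import hashlib
from PIL import Image

img = Image.open("my_input.png")  # any local candidate image
print(hashlib.sha256(img.tobytes()).hexdigest()[0:8])
# compare this value against the init-images filename or the hash recorded in the output metadata
```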