huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers

Inpainting produces results that are uneven with input image #5808

Open vladmandic opened 11 months ago

vladmandic commented 11 months ago

Describe the bug

SD inpainting works fine only if the mask is absolutely perfect.
Otherwise, there are always visible seams at the edge of the mask, and uneven colors between inpainted and input image.

I've tried manually assembling latents and passing them as image and mask_image instead of images, as well as manually assembling the entire masked_image_latents - the results are the same, so I left the reproduction as simple as possible.

The same behavior is visible in the SD and SD-XL pipelines, using the base model as well as dedicated inpainting models.
Non-diffusers inpainting implementations, such as the legacy A1111 implementation, do not have this issue.

I've attached very simple reproduction code that:

  1. generates an image
  2. creates a mask as a square in the middle of the image
  3. runs the inpainting pipeline

Reproduction

import torch
import diffusers
from PIL import Image

model = '/mnt/f/Models/stable-diffusion/cyan-2.5d-v1.safetensors'

# 1. generate the base image with a plain text-to-image pipeline
pipe0 = diffusers.StableDiffusionPipeline.from_single_file(model).to('cuda')
print(pipe0.__class__.__name__)
base = pipe0('seashore').images[0]
print(base)

# 2. create a hard mask: a white square, half the image size, centered in the image
mask = Image.new('L', base.size, 0)
square = Image.new('L', (base.size[0]//2, base.size[1]//2), 255)
mask.paste(square, (base.size[0]//4, base.size[1]//4))
print(mask)

# 3. reuse the same components in an inpainting pipeline and inpaint the masked area
pipe1 = diffusers.AutoPipelineForInpainting.from_pipe(pipe0).to('cuda')
print(pipe1.__class__.__name__)
inpaint = pipe1('house', image=base, mask_image=mask, strength=0.75).images[0]
print(inpaint)

base.save('base.png')
mask.save('mask.png')
inpaint.save('inpaint.png')

Logs

No response

System Info

diffusers==0.23.0

Who can help?

@patrickvonplaten @yiyixuxu @DN6 @sayakpaul

Examples

base

mask

inpaint

Note: this issue was originally reported at https://github.com/vladmandic/automatic/issues/2501, which you can check for additional examples.

patrickvonplaten commented 11 months ago

Thanks for the clean issue here! @yiyixuxu can you have a look?

yiyixuxu commented 11 months ago

hi @vladmandic:

I'm trying to compare with auto1111, but I'm seeing the same issue there - can you tell me if there is anything wrong with my settings?

[screenshots: auto1111 settings and results]

yiyixuxu commented 11 months ago

Played around with it a little bit more. I think the "mask blur" option helps with this issue. I will look into adding this in diffusers.

It is still not perfect, though. Let me know if there is anything else that I missed - I'm pretty new to auto1111, so it would help a lot if you could point me to the correct settings.

mask blur = 0: [image]

mask blur = 32: [image]
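
In the meantime, the same effect can be approximated by blurring the mask with PIL before passing it to the pipeline (a minimal sketch; the radius is an arbitrary value playing the role of auto1111's "mask blur"):

    from PIL import Image, ImageFilter

    def blur_mask(mask: Image.Image, radius: int = 32) -> Image.Image:
        # soften the hard mask edges so the inpainted region blends into the image
        return mask.convert('L').filter(ImageFilter.GaussianBlur(radius))

    # blurred = blur_mask(mask, radius=32)
    # inpaint = pipe1('house', image=base, mask_image=blurred, strength=0.75).images[0]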

vladmandic commented 11 months ago

I think mask blur is really good at "hiding" the issue with inpaint, so it would be a welcome addition to diffusers. The underlying problem still exists, but I'm really unsure how else to address it.

yiyixuxu commented 11 months ago

@vladmandic

OK, I will add the mask blur! Agreed that it does not seem to resolve the issue completely - but it seems like the underlying issue exists in both diffusers and auto1111, no? Just want to make sure so that I don't waste more time digging into auto1111's code base.

vladmandic commented 11 months ago

I'll dig into it more, you can focus on mask blur. If I find something else I'll update here.

23pennies commented 11 months ago

I have had the time to test this out further, and it looks like it's indeed very similar between Diffusers and the original backend - but not the same. The best way to test this is an image with contrast and a mask covering more than one object, like background/foreground/clothing and such. Tests were done using @vladmandic's UI.

My test image: [image]
The mask (no blur applied): [image]
Results with the original backend (assuming it's the same as auto1111): [image]
Results with the diffusers backend: [image]

In both, the hue of the shirt is changed slightly, but I think it's (maybe subjectively) worse in diffusers. However, if we zoom in at the shoulder: [zoomed comparison images]

In the original, the fuzzy background color is almost untouched. In the diffusers example, the fuzzy background color gets almost the same treatment as the shirt, being desaturated by what looks like an identical amount. I think a bit of a color shift is expected, maybe because of latent encoding, maybe because the model only "knows" certain colors or shades. So you would get different results with different colors, objects, and so on. But diffusers adds a plain discoloration to the image.

I also saw that the preview of the generation process hints at a difference: [preview images]

The original backend only adds noise to the masked area and denoises it. Diffusers seems to add noise to the entire image and then reconstructs it somehow? Maybe during that reconstruction, color information gets lost.

One thing to consider: when using auto1111 or derivative UIs, inpainting always pastes the generated image on top of the original with the mask applied. My theory is that the discoloration actually affects the entire generated image and is only seen as a "border" in the result because of the post-processing. Maybe @vladmandic could add a setting to his UI that outputs the unprocessed result, for debug purposes?
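
For clarity, that paste-back post-processing is essentially the following (a sketch with PIL; the feathering radius is an assumption, not the exact UI code):

    from PIL import Image, ImageFilter

    def paste_back(original: Image.Image, generated: Image.Image,
                   mask: Image.Image, feather: int = 4) -> Image.Image:
        # keep the generated pixels only where the mask is white, feathered a little at the edge
        soft_mask = mask.convert('L').filter(ImageFilter.GaussianBlur(feather))
        return Image.composite(generated, original, soft_mask)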

vladmandic commented 11 months ago

inpainting always pastes the generated image on top of the original with the mask applied.

There is no such thing - all the "magic" happens in preprocessing. The difference in the live preview is likely due to "mask only" vs "full image".

23pennies commented 11 months ago

If you're referring to "Inpaint area", I always use the "Whole picture" option.

23pennies commented 11 months ago

Using the TAESD live preview method, I can see no visible mask seams in the latents, and the whole picture seems to be discolored (but that could be TAESD): [preview image]

Final image with visible seams: [image]

SDXL inpainting is a lot worse with discoloration: [image]

But again, no visible seams in the preview, everything is "equally" discolored: [preview image]
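
For reference, such a preview can be produced roughly like this with diffusers' AutoencoderTiny and the TAESD weights (the model ID and the output-range handling are assumptions, not the exact SD.Next code):

    import torch
    from diffusers import AutoencoderTiny

    # TAESD weights for SD 1.x; "madebyollin/taesdxl" would be the SDXL variant (assumed IDs)
    taesd = AutoencoderTiny.from_pretrained('madebyollin/taesd', torch_dtype=torch.float16).to('cuda')

    @torch.no_grad()
    def preview_latents(latents: torch.Tensor) -> torch.Tensor:
        # decode the in-progress latents cheaply for a live preview
        image = taesd.decode(latents.half().to('cuda')).sample
        return (image / 2 + 0.5).clamp(0, 1)  # assuming the usual [-1, 1] decoder output range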

vladmandic commented 11 months ago

That's interesting - can you try, in img2img -> advanced, disabling "full quality"? That basically forces usage of TAESD for the final decode as well.

23pennies commented 11 months ago

Sure: [image]

vladmandic commented 11 months ago

So it's not a VAE thing, thus must be diffusers postprocessing?

23pennies commented 11 months ago

So it's not a VAE thing, thus must be diffusers postprocessing?

I don't know for sure if the VAE is involved, but diffusers is definitely not doing it, since I just found the culprit in the UI: https://github.com/vladmandic/automatic/blob/69bda18e239a8b4d7b9a3a2a7fd450f69351cbae/modules/processing.py#L940C38-L940C38

I added output_images.append(image) before this line and got a grid with the unprocessed and the processed result: [image]

This should be useful as a setting. The full image discoloration can be jarring, but might be preferable over the mask seams, and might be easier to fix with an image editing program.

vladmandic commented 11 months ago

This should be useful as a setting. The full image discoloration can be jarring, but might be preferable over the mask seams, and might be easier to fix with an image editing program.

Good point, I'll add it.

github-actions[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

castortroy83 commented 9 months ago

push

yiyixuxu commented 9 months ago

@castortroy83 we added two auto1111 features in https://github.com/huggingface/diffusers/pull/6072 that will help with the inpainting generation and the mask edge issue.
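
Roughly, they can be used like this (a sketch; parameter values are arbitrary, and pipe1, base, and mask are the objects from the reproduction above):

    # blur the mask edges before inpainting
    blurred_mask = pipe1.mask_processor.blur(mask, blur_factor=33)

    # inpaint only a padded crop around the mask, then paste the result back automatically
    inpaint = pipe1('house', image=base, mask_image=blurred_mask, strength=0.75, padding_mask_crop=32).images[0]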

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

sayakpaul commented 8 months ago

Hope this is solved now?

23pennies commented 8 months ago

Sorry, while the UI-side workaround is better than nothing, it doesn't fix the underlying issue of discoloration. I didn't want to push this further, as I thought it's just how inpainting works. But I got into trying out ComfyUI, and it does inpainting almost perfectly.

For comparison, here is the result with Diffusers on SD.Next, everything updated to the latest version, denoising at 0.99, with the SDXL 0.1 inpainting model: [image]

And here is the ComfyUI result: [image]

The ComfyUI inpainting is nearly perfect and there is almost no color shifting at all. It tells me that a proper fix for this is possible and should be worthwhile to pursue.

sayakpaul commented 8 months ago

Are the models the same for your tests? If so, cc'ing @patil-suraj @yiyixuxu here.

Cc: @vladmandic as well.

23pennies commented 8 months ago

Same model, same sampler, same denoising.

Another pointer to a possible cause/solution: In ComfyUI, the nodes for the above output look like this: [node graph screenshot]

For curiosity's sake, I tried giving the sampler a separately decoded latent, instead of the one from the Inpaint Conditioning node: [node graph screenshot]

The result: [image] Similar discoloration.

sayakpaul commented 8 months ago

Alright. Could you maybe provide your diffusers code snippet?

Also

For curiosity's sake, I tried giving the sampler a separately decoded latent, instead of the one from the Inpaint Conditioning node:

Could you expand a bit more on this?

asomoza commented 8 months ago

I did some tests. The image you're using is not 1024x1024, so I upscaled it to test the difference between them, and I don't see that much difference from the ComfyUI results:

Edit: I was comparing to the "bad" results of comfyui, I get what you mean now. I'll dig deeper into this.

Normal SDXL

source / inpainting / diff: [images]

Inpainting SDXL

source / inpainting / diff: [images]

Inpainting SDXL (blurred mask)

source / inpainting / diff: [images]

asomoza commented 8 months ago

I tested it more, and the difference was that ComfyUI uses by default the "only inpaint mask" option, so it only affects the area around the mask. With this code:

    image = pipe(
        prompt,
        image=base,
        mask_image=mask_blurred,
        guidance_scale=8,
        strength=0.99,
        num_inference_steps=20,
        generator=generator,
        padding_mask_crop=32,  # inpaint only a padded crop around the mask, then paste it back
    ).images[0]

The results are the same as ComfyUI's:

[result images]
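
For completeness, the surrounding setup looks roughly like this (a sketch; the blur factor, seed, and prompt are arbitrary/assumed):

    import torch
    from diffusers import AutoPipelineForInpainting
    from diffusers.utils import load_image

    pipe = AutoPipelineForInpainting.from_pretrained(
        'diffusers/stable-diffusion-xl-1.0-inpainting-0.1',
        torch_dtype=torch.float16,
    ).to('cuda')

    base = load_image('source.png').resize((1024, 1024))            # the upscaled test image
    mask = load_image('mask.png').resize((1024, 1024))
    mask_blurred = pipe.mask_processor.blur(mask, blur_factor=33)    # blur factor is an assumption
    generator = torch.Generator(device='cuda').manual_seed(0)        # seed is arbitrary
    prompt = 'necklace'                                              # prompt from the earlier examples (assumption)
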
sayakpaul commented 8 months ago

@23pennies does the comment above from @asomoza help?

23pennies commented 8 months ago

@asomoza Could you say which variable in that snippet is for the "only inpaint mask option"? Also, which model were you using?

asomoza commented 8 months ago

@asomoza Could you say which variable in that snippet is for the "only inpaint mask option"? Also, which model were you using?

padding_mask_crop=32

https://huggingface.co/docs/diffusers/using-diffusers/inpaint#padding-mask-crop

and I tested it with the inpainting model which seems to "decolorize" the image more than the normal one.

https://huggingface.co/diffusers/stable-diffusion-xl-1.0-inpainting-0.1

23pennies commented 8 months ago

I'm using SD.Next, and it doesn't look like it implements that. I've tried hacking it in myself, and the discoloration still happens: [image]

I then tried hard-coding the arguments so they're as close to yours as possible:

        output = shared.sd_model(
            "necklace",
            image=p.init_images[0],
            mask_image=p.image_mask,
            guidance_scale=8,
            strength=0.99,
            num_inference_steps=20,
            generator=torch.Generator(device="cuda").manual_seed(0),
            padding_mask_crop=32,
        )

And the results are still discolored: [image]

However, you mentioned blurring. I blurred the mask (manually this time): [mask image] and the results are overall less discolored: [image]

But this still doesn't seem to be the solution. The discoloration is still there sometimes, and the blurred mask adds additional problems. You can see in the above example, where two buttons are on top of each other, that the lower one is faded out. That's the pipeline blending the result with the original image. The non-blended result is this: [image]

In ComfyUI this doesn't happen, as I get nearly perfect results without blurring the mask. Also, you said

I tested it more, and the difference was that ComfyUI uses by default the "only inpaint mask" option, so it only affects the area around the mask.

Could you point me to where you found this? My understanding of how ComfyUI works doesn't align with it.

asomoza commented 8 months ago

Could you point me to where you found this? My understanding of how ComfyUI works doesn't align with it.

looking at the code of that node:

https://github.com/comfyanonymous/ComfyUI/blob/1abf8374ec690972ba512a50eeaa767935536441/nodes.py#L400-L406

        m = (1.0 - mask.round()).squeeze(1)    # 1 where the image is kept, 0 where it is masked
        for i in range(3):
            pixels[:,:,:,i] -= 0.5             # shift each RGB channel so mid-grey becomes 0
            pixels[:,:,:,i] *= m               # zero out (i.e. grey out) the masked pixels
            pixels[:,:,:,i] += 0.5             # shift back, leaving flat grey in the masked area
        concat_latent = vae.encode(pixels)     # latent of the grey-filled image (conditioning input)
        orig_latent = vae.encode(orig_pixels)  # latent of the untouched original image

I'm no expert in ComfyUI, and I think one of its weak points is that there's almost no documentation - at least I couldn't find anything related to that node.

The normal node and what's documented here:

https://comfyanonymous.github.io/ComfyUI_examples/inpaint/

is the default in diffusers and the result is the same too:

[image]

I really don't know if the hard-coding would work in sd.next. If it doesn't do anything on top of diffusers, the results should be exactly the same; if it does, then IMO that's something to discuss in the sd.next repo.

Also I found that the inpainting model discolors things instead of changing them, probably because it doesn't have enough information to do it, but if I do this prompt:

"red shirt"

it does the discoloration thing instead of painting it red, and if I use the normal SDXL model with ComfyUI I get this:

InpaintModelConditioning vs. VAE Encode (for inpainting): [comparison images]

You can clearly see that it is not inpainting the whole image but blending the masked section in. Also, InpaintModelConditioning most of the time doesn't change the color either, even with a whole-shirt mask.

InpaintModelConditioning vs. VAE Encode (for inpainting): [comparison images]

Also, you'll need to take into account that the results are a lottery - it all depends on whether you get a good seed for what you're asking it to do.

That's the pipeline blending the result with the original image

That is something all of the solutions need to do: unless you inpaint the whole image, you need to blend the new inpainted section with the old image. There are techniques that make this better (InvokeAI, for example, uses patchmatch), but most people just do a second pass over the whole image.

If you want better results, I recommend you use the new Differential Diffusion; @vladmandic wrote that he added it to sd.next, and IMO it's a lot better. Also, I think automatic1111 just released soft inpainting, but I don't know if that is in sd.next yet.

23pennies commented 8 months ago

Thanks for the reply. Sorry I respond so sporadically.

https://github.com/comfyanonymous/ComfyUI/blob/1abf8374ec690972ba512a50eeaa767935536441/nodes.py#L400-L406

I'm not seeing an equivalent to padding_mask_crop here. What this piece of code seems to do is it rounds each pixel in the mask up or down to white or black (so a blurred mask doesn't actually do much), then turns every pixel in the image that is covered by the mask to grey. I hacked together a node that returns the "pixels" value to confirm it: [image]

Then it encodes this and the original image into latents and returns an object that has both of these latents as elements. There is no padding or cropping going on; the entire image is sent to the sampler node.

In Diffusers, with padding_mask_crop, the image is cropped: https://github.com/huggingface/diffusers/blob/b4226bd6a742640d98bf0ce17e984cdc92f4cdf6/src/diffusers/image_processor.py#L504 and then later stitched into the original image: https://github.com/huggingface/diffusers/blob/b4226bd6a742640d98bf0ce17e984cdc92f4cdf6/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_inpaint.py#L1789 (mimicking the functionality that's already in automatic and SD.Next, which is why SD.Next doesn't use it).

In ComfyUI, this wouldn't be possible; the KSampler node has no access to a VAE that could turn latents into images for cropping, and VAE decode has no stitching-together functionality. Further evidence for this: I manually added Gaussian noise to the masked area (left picture). The sampler in ComfyUI (middle picture) correctly sees the entire image as context and paints a guy's chin, a blue shirt and a strange chain (from the prompt "necklace"). Diffusers with padding_mask_crop=32 has no clue what is going on because the context was cropped out (right picture): [comparison image]
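
To illustrate, the crop-and-stitch that padding_mask_crop performs looks roughly like this in PIL terms (a conceptual sketch, not the actual diffusers code; inpaint_fn stands in for the pipeline call on the crop):

    from PIL import Image

    def crop_inpaint_stitch(image: Image.Image, mask: Image.Image, inpaint_fn, pad: int = 32) -> Image.Image:
        # bounding box of the white mask region, expanded by the padding
        left, top, right, bottom = mask.convert('L').getbbox()
        box = (max(left - pad, 0), max(top - pad, 0),
               min(right + pad, image.width), min(bottom + pad, image.height))
        crop_img, crop_mask = image.crop(box), mask.crop(box)
        inpainted_crop = inpaint_fn(crop_img, crop_mask)            # the pipeline only ever sees this crop
        result = image.copy()
        result.paste(inpainted_crop.resize(crop_img.size), box[:2], crop_mask.convert('L'))
        return result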

The normal node and what's documented here:

https://comfyanonymous.github.io/ComfyUI_examples/inpaint/

Can't say I'm a fan of its documentation. The VAE Encode for Inpaint node is outdated and the InpaintModelConditioning is what should be used.

I really don't know if the hard coding would work in sd.next, if it doesn't do anything on top of diffusers, the results should be exactly the same, if it doesn't, then IMO that's something to discuss in the sd.next repo.

I just mentioned it to rule out the possibility of something else messing up, so it's just those parameters.

Also I found that the inpainting model discolors things instead of changing them, probably because it doesn't have enough information to do it, but if I do this prompt: "red shirt" it does the discoloration thing instead of painting it red, and if I use the normal SDXL model with ComfyUI I get this: (...) You can clearly see that it is not inpainting the whole image but blending the masked section in. Also, InpaintModelConditioning most of the time doesn't change the color either, even with a whole-shirt mask.

Yeah, I noticed that even in ComfyUI the discoloration starts happening to the whole image if the mask is big enough, but even so, it's not as harsh as in Diffusers. And inpaint models aren't perfect; in some specific cases like the one here (red shirt) they fail. In this case, "green shirt" works much better. But this seems to be on the model anyway - I get the same results with Diffusers.

Also you'll need to take into account that the results are also a lottery, it all depends if you get a good seed for what you're asking it to do.

Yeah, I double and triple check everything with a dozen different seeds.

it is something that all of the solutions need to do, unless you inpaint the whole image you need to blend the new inpaint section with the old image, there are techniques that makes this better like InvokeAI that uses patchmatch but most people just do a second pass over the whole image.

If ComfyUI blends the results with the original image, it must do so in latent space. Maybe that's the key difference? Could we perhaps try that in Diffusers?

If you want better results, I recommend you use the new Differential Diffusion; @vladmandic wrote (https://github.com/huggingface/diffusers/issues/7038#issuecomment-1958692925) that he added it to sd.next, and IMO it's a lot better. Also, I think automatic1111 just released soft inpainting, but I don't know if that is in sd.next yet.

Differential diffusion is amazing, but it has its limits. In my testing, it couldn't handle denoising above 0.8, so inpainting models are not dead, yet.

asomoza commented 8 months ago

Thanks for the reply. Sorry I respond so sporadically.

No problem, I also want to find the differences between diffusers and comfyui, so I'm just glad that someone wants to also put some effort into this instead of just wanting answers.

What this piece of code seems to do is it rounds each pixel in the mask up or down to white or black (so a blurred mask doesn't actually do much), then turns every pixel in the image that is covered by the mask to grey.

Yeah, I didn't go any further with this since I believed that the orig_latent and the mask were blended later in the ComfyUI core and it was just using the mask to inpaint. This is just my lack of knowledge of how ComfyUI works; I'll probably do a full trace of what it is doing before assuming things just from the names of variables.

If ComfyUI blends the results with the original image, it must do so in latent space. Maybe that's the key difference? Could we perhaps try that in Diffusers?

So far, while discussing this with you, I found that ComfyUI does two things differently than diffusers:

  1. It replaces the area to be inpainted in the image with grey (0.5) (probably makes the inpainting over it better?)
  2. It does the blending in latent space? (I need to trace where it does this.)

But just from this, we can be almost sure that the whole image is being passed to the inpainting, probably to grab the context, and then only the masked part is blended back in.

I don't think this functionality is in diffusers, though; we only have the options to pass the whole image and return the whole inpainted image, or to pass just the masked section of it and get back the original image with the blended masked section. Maybe @yiyixuxu can corroborate this.

So what we're missing is sending the whole image for inpainting and getting back the masked part blended with the original latents; maybe we can use a nice way of doing this, like this code from automatic1111:

https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/bef51aed032c0aaa5cfd80445bc4cf0d85b408b5/extensions-builtin/soft-inpainting/scripts/soft_inpainting.py#L49-L102

I'll try to do a PoC of this when I have the time and see how it goes.
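
A minimal sketch of the kind of per-step latent blending I mean, in plain PyTorch (in real code the original latents would first be re-noised to the current timestep, e.g. with the scheduler's add_noise):

    import torch

    def blend_latents(denoised_latents: torch.Tensor,
                      original_latents: torch.Tensor,
                      latent_mask: torch.Tensor) -> torch.Tensor:
        # latent_mask is 1 inside the inpaint region and 0 outside, already
        # downsampled to latent resolution (e.g. 128x128 for a 1024px SDXL image)
        return latent_mask * denoised_latents + (1 - latent_mask) * original_latents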

Differential diffusion is amazing, but it has its limits. In my testing, it couldn't handle denoising above 0.8, so inpainting models are not dead, yet.

I did a test with 0.9 and it doesn't look that bad though (cherry-picked):

[image]

I did the same with comfyui and it looks worse IMO (cherry-picked):

[image]

23pennies commented 8 months ago

I had to update my understanding of differential diffusion, so here's another attempt. Less structured this time because it's not actually directly comparable to inpainting models. Also, the differential diffusion script in SD.Next was broken so I did this with ComfyUI, but I haven't seen a qualitative difference between the two.

Differential diffusion did well in the above case with a fairly simple image, but breaks down with more complex scenes.

Inpainting models do very well at taking the entire image into consideration. I have this base image: [image]

I want to replace the soldier with a cowboy. The inpainting model does great at 1.0 denoising. The planet in the background, for example, is restored almost perfectly where it was previously occluded by the soldier: [image]

If I leave the gun barrels unmasked, the inpainting model adapts the entire masked area to accommodate this new detail, despite being the same seed: [image]

Differential Diffusion (at 0.9 denoising) breaks completely with a hard mask. It's actually almost identical to inpainting without differential: [image]

Differential diffusion only starts working with a soft mask. The inpainted content blends in much better, and you might prefer these results because they're more interesting, but it still has problems, such as the barrels becoming floating black sticks, the horse fading out at the edges, and the planet not being restored as nicely: [image]

The problems are more severe at 1.0 denoising: [image]

And better at 0.8, but the image sticks closer to the original: [image]

So my takeaway for now is: Differential diffusion requires blurred masks, which are harder to control, and only works well at denoising levels that can't replace entire objects. I can see use cases for it, but so far it's not replacing inpainting models.

vladmandic commented 8 months ago

Differential diffusion is not a replacement for standard inpainting, nor is it supposed to be. And yes, it works with grayscale masks as it's designed to; it doesn't work well with hard masks.

For your example above, try combining differential diffusion with a depth model to create a mask - it will follow the original input precisely while replacing your soldier with a cowboy. (Tune denoise strength as well as mask intensity to your liking; this is denoise=0.5 and intensity=0.1.)

And if you want to fine-tune, you can then take that mask and do further edits: [image]
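
Roughly, building such a grayscale mask from a depth estimate could look like this (a sketch; the depth model choice is an assumption, and the resulting map can be scaled or inverted to set the mask intensity):

    import numpy as np
    from PIL import Image
    from transformers import pipeline as hf_pipeline

    depth_estimator = hf_pipeline('depth-estimation', model='Intel/dpt-large')

    def depth_to_mask(image: Image.Image) -> Image.Image:
        depth = np.array(depth_estimator(image)['depth'], dtype=np.float32)
        depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)   # normalize to [0, 1]
        # grayscale change map for differential diffusion; multiply by a small
        # factor (the "intensity") or invert it to control how much gets replaced
        return Image.fromarray((depth * 255).astype(np.uint8), mode='L')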

asomoza commented 8 months ago

IMO we need to draw a line here. I was just concentrating on replicating the same functionality as ComfyUI, not trying to achieve the SOTA of inpainting. I did post about differential as a suggestion because IMO it is a lot better than normal inpainting, and I disagree - in my tests and use cases it completely replaces normal inpainting - but each of us has our own opinion about this, and that's all good.

Just to be fair with differential, I'll post my results with your example.

Using what I call a big bad mask done in seconds with GIMP and a prompt "cowboy on a horse":

mask / result 1 / result 2 / result 3 / result 4: [images]

with a better mask done also in seconds and a prompt "cowboy":

mask / result 1 / result 2 / result 3 / result 4: [images]

This result can be improved if you use a depth map and a more precise mask over it to keep more of the details of the original image, and also with some inpainting and a second pass over it. But good results can also be achieved with normal inpainting and similar techniques. If you want a full-blown inpainting with the best quality I can do it, but IMO that is not the issue we're discussing here.

Going back to the original issue, what we're trying to achieve is to come closer to "one-step inpainting with a binary mask" similar to ComfyUI.

So far I got the gray masking working (which doesn't improve the result) and also inserting the inpainted part back into the original, but I can now clearly see that the new inpainting always comes out discolored. Here's an example if I blend them in latent space:

Normal inpainting / Inpainted + original: [images]

This is an example if I do it after the vae decode:

Normal inpainting / Inpainted + original: [images]

It doesn't matter how many times I try, it always returns a "washed out" image, so I'm also starting to think there's something wrong in diffusers. I did a quick debug in ComfyUI, and I couldn't find any code that does something different; it just returns the image with the correct colors and saturation.

This is something I can fix with just matching the histogram of the inpainted part to the original image:

[image]

But I don't think this is the correct method to fix this issue - the inpainting shouldn't return a washed-out image. Also, this only happens with the inpaint model and not the normal one, or with the padding_mask_crop option.

Edit: I just found out that this is a 5-month-old or older issue, so it probably won't get fixed. It is also listed under the model's limitations:

When the strength parameter is set to 1 (i.e. starting in-painting from a fully masked image), the quality of the image is degraded. The model retains the non-masked contents of the image, but images look less sharp. We're investigating this and working on the next version.

But I found out that even at 0.7 it still returns a discolored image, so I guess my solution is as good as any; if anyone wants it, I can clean it up and post it.
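
For reference, the core of that histogram-matching workaround is just this (a rough sketch using scikit-image; cropping to the inpainted region and re-blending are left out):

    import numpy as np
    from PIL import Image
    from skimage.exposure import match_histograms

    def match_colors(inpainted: Image.Image, original: Image.Image) -> Image.Image:
        # shift the inpainted image's per-channel histograms towards the original's,
        # which removes most of the washed-out / desaturated look
        matched = match_histograms(np.array(inpainted), np.array(original), channel_axis=-1)
        return Image.fromarray(matched.astype(np.uint8))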

exx8 commented 7 months ago

Hello, I'm the author of diff-diff. My apologies for the delayed response. Thank you for conducting experiments with diff-diff. Regarding its interaction with inpaint: while it's true that the current implementation of diff-diff can serve as a replacement for inpaint models in certain applications, it relies on the general checkpoint, which may not be ideal for some use cases. However, I believe that diff-diff can be used with inpaint checkpoints effectively.

If I recall correctly, inpaint checkpoints accept additional channels for the inpaint mask and the covered picture. This aligns perfectly with the algorithm - the key idea is that on every step, those components are updated according to the current threshold mask. I believe that integrating diff-diff with inpaint checkpoints might address the issues raised by @23pennies @asomoza.
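
A rough, self-contained sketch of that per-step update, with dummy tensors standing in for the real UNet inputs (shapes, the threshold schedule, and the masked-latent shortcut are all assumptions):

    import torch

    latents = torch.randn(1, 4, 64, 64)            # current noisy latents
    original_latents = torch.randn(1, 4, 64, 64)   # VAE-encoded input image
    change_map = torch.rand(1, 1, 64, 64)          # diff-diff map: 0 = keep, 1 = free to change

    num_steps = 30
    for i in range(num_steps):
        # regions whose allowed change exceeds the (decreasing) threshold are unfrozen first
        threshold = 1.0 - (i + 1) / num_steps
        mask = (change_map >= threshold).float()
        # for a 9-channel inpaint UNet, the extra inputs would be rebuilt from this mask;
        # real code would re-encode the image with the masked area blanked in pixel space
        masked_image_latents = original_latents * (1 - mask)
        unet_input = torch.cat([latents, mask, masked_image_latents], dim=1)  # 4 + 1 + 4 channels
        # ... unet(unet_input, t) and scheduler.step(...) would go here ...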

asomoza commented 7 months ago

Hi @exx8, I will look into this then. I was happy with the results using normal models; if you're saying that with inpainting models it could get even better, it's worth investigating.

exx8 commented 7 months ago

Hi @exx8, I will look into this then. I was happy with the results using normal models; if you're saying that with inpainting models it could get even better, it's worth investigating.

It really depends on the context. Some edits may be better with the general checkpoint, while others might be better with the inpaint ones. It's worth noting that the strength values might have different impacts between different models. It's expected that inpaint models will be more modest with their changes for the same strength value.

github-actions[bot] commented 6 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

polavishnu4444 commented 5 months ago

Hi, any update on fitting diff-diff in with an inpainting checkpoint?

polavishnu4444 commented 5 months ago

Also, with the current inpainting, is there any conclusion or workaround we can use to prevent the discoloration from happening?

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.