On the same subject: when adding `ImageScale` nodes, we use the `bilinear` algorithm by default. I would argue that using `lanczos` (or at least `bicubic`) is preferable in almost every case. It is still not as sharp as using an upscale model, but way better than `bilinear`.

In fact the final `ImageScale` of the workflow also uses `bilinear`, which causes aliasing when scaling down, mostly visible on clean anime lines. In this instance the `bicubic` method is as bad as the `bilinear` one, but the results with `lanczos` are very nice.

I think we should simply update all calls to `lanczos`, for both downsampling and upsampling (see the sketch after the comparison below). Do you think there is a downside to this?

In the following pictures, pay close attention to the dark hair lines on a light background to see the visible aliasing:
| Scaler | Downscaled image |
|---|---|
| `bilinear` | *(image)* |
| `bicubic` | *(image)* |
| `lanczos` | *(image)* |
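As a rough sketch of what the change amounts to at the ComfyUI level (the helper function, node id and image reference below are made up for illustration; only the `ImageScale` node and its `upscale_method`, `width`, `height` and `crop` inputs are the actual ComfyUI names):

```python
# Illustrative only: build an ImageScale node for a ComfyUI API-format workflow,
# requesting "lanczos" instead of the default "bilinear".
def image_scale_node(image_ref, width, height, method="lanczos"):
    return {
        "class_type": "ImageScale",
        "inputs": {
            "image": image_ref,        # e.g. ["4", 0] -> output 0 of node "4"
            "upscale_method": method,  # "nearest-exact" | "bilinear" | "area" | "bicubic" | "lanczos"
            "width": width,
            "height": height,
            "crop": "disabled",
        },
    }

# Hypothetical usage: node "10" scales the output of node "4" to 1024x1024 with lanczos.
workflow = {"10": image_scale_node(["4", 0], 1024, 1024)}
```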
So this is the case where the image (region) size is smaller than the checkpoint resolution. The workflow becomes "Upscale -> Diffuse -> Downscale". The main point of this mechanism is to avoid bad generations at low resolutions, since checkpoints generally can't handle them and produce garbage. It's not meant to sharpen results in the area; ideally they should be consistent with the lower resolution used by the canvas. That's why an AI upscaler is not used here: if the render result appears a bit blurry, that may actually be ok, because it's downscaled afterwards. AI upscalers can have detrimental effects too, so I want to be careful with using them automatically unless >50% diffusion happens afterwards, or it's explicitly desired.
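Just to make that flow concrete, a minimal sketch; the threshold of 512, the callables and their signatures are assumptions for illustration, not the plugin's actual code:

```python
def generate_region(image, region_size, diffuse, upscale, downscale, min_extent=512):
    """Sketch of the small-region path: the three callables stand in for the plugin's
    diffusion / scaling steps (assumed names, not taken from the real code)."""
    w, h = region_size
    if w >= min_extent and h >= min_extent:
        return diffuse(image)  # native resolution is workable, no extra scaling

    # Region is too small for the checkpoint: render larger, then fit it back.
    scale = min_extent / min(w, h)
    hires = upscale(image, scale)     # plain resize (ideally lanczos), not an AI upscaler
    result = diffuse(hires)           # checkpoint now sees a usable resolution
    return downscale(result, (w, h))  # back to canvas resolution; a slightly soft result is ok
```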
Viewed from another angle: if you want sharper results or better quality, the recommendation should be to increase the canvas size, using an upscaler of your choice. Then there is no implicit upscale before diffusion, and more importantly, you get to keep the full-resolution render output instead of throwing away most of the pixels. This is especially important for follow-up refines; the upscale/diffuse/downscale workflow will always have detrimental effects, regardless of which upscaler is used.
I do very much agree with using Lanczos over Bilinear. I think it should mostly fix the case where bilinear introduces additional blur on top of the resolution change. It wasn't available in Comfy when I initially implemented this, and it has since been drowned somewhere on the TODO list...
> I do very much agree with using Lanczos over Bilinear. I think it should mostly fix the case where bilinear introduces additional blur on top of the resolution change.
Yes, I think that would be a sufficient fix. I wasn't aware of the problem until after I posted the first issue and kept poking around in the workflow debug output.
> Viewed from another angle: if you want sharper results or better quality, the recommendation should be to increase the canvas size, using an upscaler of your choice.
I tend to work at lower resolutions and use inpaint as a detailer tool. But I get that this is not the recommended usage, so no worries if it is not fully supported in the plugin.

Using an upscaler model is kind of overkill, but I still like the idea because it has a feel comparable to using the detailer nodes in ComfyUI. Fortunately I keep a custom fork for that kind of hack. 😁
> I tend to work at lower resolutions and use inpaint as a detailer tool.
There was this issue where I tried to convince some A1111 veterans that it's an anti-pattern. Mostly without success I think ^^
It's interesting, because there are also people here who come from Krita/Photoshop and are new to SD, and of course their canvas or photo is something like 6000x4000, that's normal, and why should they downscale their painting or photos for SD? I think it's a good point. Different mediums work at different resolutions, but the canvas should match the highest one to avoid losing information. And there should be no downscales, for the same reason. SD generation can be upscaled where needed, because native resolutions that high are often just not practical, but preferably not the other way round.
Ideally the only point where you downscale anything is the final "export to PNG for reddit post" step. And I'd like to make that more realistic, but admittedly not quite there yet.
(Also, those detailer nodes! People make them downscale the high-res result they already computed to insert it into a tiny image... and then upscale that afterwards?!? It makes so much more sense to do it the other way round...)
> There was this issue where I tried to convince some A1111 veterans that it's an anti-pattern. Mostly without success I think

> It's interesting, because there are also people here who come from Krita/Photoshop and are new to SD, and of course their canvas or photo is something like 6000x4000, that's normal, and why should they downscale their painting or photos for SD? I think it's a good point. Different mediums work at different resolutions, but the canvas should match the highest one to avoid losing information. And there should be no downscales, for the same reason. SD generation can be upscaled where needed, because native resolutions that high are often just not practical, but preferably not the other way round.
Working with a large image is always good, but processing (encoding/decoding) such images is too resource-intensive for many users without top-end computers. I have just found this add-on, and the first samples showed good results and high speed for me, while keeping the idea of working at maximum resolution: https://github.com/lquesada/ComfyUI-Inpaint-CropAndStitch

If it were possible for your application to implement such a feature, I believe many users would appreciate it. I am interested in hearing your thoughts on whether there are any significant drawbacks to this process.
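For context, the crop-and-stitch approach boils down to something like the following conceptual sketch; it is not the linked nodes' actual code, and the padding and blending details are assumptions:

```python
import numpy as np

# Conceptual sketch of "crop and stitch" inpainting: only the masked region
# (plus some context padding) goes through encode/diffusion, and the result is
# pasted back into the full-resolution image.
def crop_and_stitch(image, mask, inpaint_fn, padding=64):
    ys, xs = np.nonzero(mask)
    y0, y1 = max(ys.min() - padding, 0), min(ys.max() + padding, image.shape[0])
    x0, x1 = max(xs.min() - padding, 0), min(xs.max() + padding, image.shape[1])

    crop = image[y0:y1, x0:x1]            # small region: cheap to encode and diffuse
    crop_mask = mask[y0:y1, x0:x1]
    result = inpaint_fn(crop, crop_mask)  # any inpaint workflow, run at the crop's size

    stitched = image.copy()
    blend = crop_mask[..., None].astype(image.dtype)     # 0/1 mask as blend weights
    stitched[y0:y1, x0:x1] = result * blend + crop * (1 - blend)
    return stitched
```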
It has worked like that from day 1 - you don't really need those nodes, they just make it more convenient when using Comfy's UI. As for drawbacks, a smaller context is faster but means you lose coherence with the rest of the image.
Not sure if this is a bug or by design, but when doing a smallish inpaint (ie. a face), the `refine_region` method does this at the start: *(code omitted)*

This method in turn uses `extent.initial_scaling`, which by default executes the following code: *(code omitted)*

This is good for masks, but not for the source image, which becomes blurry because of the bilinear scaling used with `ScaleMode.resize`. The inpainted region is then not as sharp as it could be. Shouldn't `ScaleMode.upscale_fast` be used here instead?

I tried altering the code of `extent.initial_scaling`, and the inpaints are indeed much sharper after the modification, at a minimal computing cost.

- Source region before resize: *(image)*
- Upscaled via `ScaleMode.resize` (bilinear): *(image)*
- Upscaled via `OmniSR_X4_DIV2K` (ie. `ScaleMode.upscale_fast`): *(image)*
- Upscaled via `UltraSharp 4x`: *(image)*

(I added `UltraSharp 4x` to give an example of an even sharper scaler.)