I suspect that this is PIL optimizing the images for compressed size
No, PIL is just doing resizing correctly. If you downsample an image with alpha, you need to weight colors by their alpha or you'll get the artifacts I described in #1321. However, weighting by alpha also maps all transparent pixels to transparent black.
We might need to add an option like "Alpha corrected" to our resize nodes.
Is PIL's RGBa mode that Zamboni added not doing it correctly then?
RGBa is RGBA with pre-multiplied alpha, but I guess it resizes in some way that preserves the color channels?
https://pillow.readthedocs.io/en/stable/handbook/concepts.html
I couldn't find an article that explains pre-multiplied alpha in the context of resizing, so I'll quickly explain here.
Example: Suppose we have a 2x2 RGBA image, and we want to resize it to 1x1, so 50%. The image has 3 fully opaque red (RGBA=1 0 0 100%) pixels and one fully transparent green (RGBA=0 1 0 0%) pixel. We are using Area (Box) as our interpolation method since it's the simplest: we just have to average all pixels in the image.
For later, let's write down some variables. I'll call our pixels p0 to p3, and n = 4 will be the number of pixels. Let aSum = p0.a + p1.a + p2.a + p3.a = 3 be the sum of all alpha values.
We know that the final alpha value of the resized pixel is going to be 75% (aSum / n), so we only have to average the color channels now. How do we do that?
If we just average the RGB channels independently, we'll get RGB=0.75 0.25 0 #BF4000 ((p0.rgb + p1.rgb + p2.rgb + p3.rgb) / n) as the final color. Despite the fact that the green pixel is invisible, it still influenced the final color, producing an orange. This color-bleeding effect is very visible. In the case of #1321, all transparent pixels were black, so we get a dark outline.
The solution to this problem is to multiply the color channels by their alpha value. This will yield RGB=1 0 0 ((p0.rgb * p0.a + p1.rgb * p1.a + p2.rgb * p2.a + p3.rgb * p3.a) / aSum; note that we divide by aSum instead of n), which is the correct color. This is why PIL and most other image libraries use this method of averaging colors in RGBA mode. (Note that the color is undefined if all pixels are fully transparent. Most libraries will just use black in this case.)
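To make the arithmetic concrete, here is a small numpy sketch of the 2x2 example above, comparing the naive and the alpha-weighted average (illustrative only, not chaiNNer code):

```python
import numpy as np

# 2x2 example: three opaque red pixels and one fully transparent green pixel.
rgb = np.array([[1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=np.float64)
a = np.array([1, 1, 1, 0], dtype=np.float64)
n, a_sum = len(a), a.sum()

naive = rgb.mean(axis=0)                           # [0.75, 0.25, 0] -> #BF4000
weighted = (rgb * a[:, None]).sum(axis=0) / a_sum  # [1, 0, 0]       -> pure red
alpha_out = a_sum / n                              # 0.75

print(naive, weighted, alpha_out)
```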
However, this method will produce wrong results if your image already has premultiplied alpha. Since all pixels have already been multiplied by their alpha channel, doing it again will yield colors that are too dark (this only affects pixels with an alpha value other than 0 and 1). So PIL tries to solve this by having an RGBa mode for images with premultiplied alpha.
But what would happen if we tried resizing our decidedly not-premultiplied-alpha image in RGBa mode? We'd get RGB=1 0.33 0 #FF5500 ((p0.rgb + p1.rgb + p2.rgb + p3.rgb) / aSum; the same formula as before, just without the * pi.a since it's premultiplied), which is even worse than the naive averaging. And it gets worse still: since the formula is just wrong for non-premultiplied-alpha images, we can even get color values > 1; we just happen not to get any with this example.
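Continuing the sketch from above, here is the RGBa-mode formula applied to the same non-premultiplied pixels, plus a second input showing the overflow (again, just an illustration):

```python
import numpy as np

# Same 2x2 example: three opaque red pixels, one fully transparent green one.
rgb = np.array([[1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=np.float64)
a = np.array([1, 1, 1, 0], dtype=np.float64)

# The RGBa-mode formula assumes colors are already premultiplied, so it sums
# them as-is and divides by aSum:
rgba_mode = rgb.sum(axis=0) / a.sum()  # [1, 0.333, 0] -> #FF5500

# With other inputs the formula even overflows. One opaque red pixel plus one
# fully transparent *white* pixel:
rgb2 = np.array([[1, 0, 0], [1, 1, 1]], dtype=np.float64)
a2 = np.array([1, 0], dtype=np.float64)
print(rgb2.sum(axis=0) / a2.sum())  # [2, 1, 1] -- red channel > 1
```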
I'm trying to test your 3-red/1-green scenario in chaiNNer, but nothing I can figure out will keep that transparent pixel green by the time it gets there; it always ends up black. Testing with that, though, you're right that RGBa darkens the result compared to RGBA, so we do need an alternative method.
Annoying that game textures sometimes work in a way where the color of the transparent areas is important. The example image that person gave seems like the background shouldn't matter, as it's clearly just a sort of "fill alpha" pattern, not something intentional.
is there any way to detect premultiplied alpha? it would suck to have to put options everywhere
PIL by default attempts to automatically determine colorspace.
Premultiplied alpha doesn't solve zamboni's issue anyway. It looks like what zamboni wants is the naive averaging (each channel independently). OpenCV does that (again, see #1321), but you can get the same result with PIL by splitting RGB from alpha, resizing/rotating those two images separately, and then combining them again. The fill color might cause problems for rotation, though.
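A minimal sketch of that split approach, assuming a float32 RGBA array in [0, 1] (the function name and signature are made up for illustration):

```python
import numpy as np
from PIL import Image

def resize_channels_independently(img: np.ndarray, out_dims, resample) -> np.ndarray:
    """Resize RGB and alpha separately so transparent colors survive."""
    rgb = Image.fromarray((img[:, :, :3] * 255).astype("uint8"), mode="RGB")
    alpha = Image.fromarray((img[:, :, 3] * 255).astype("uint8"), mode="L")
    rgb = rgb.resize(out_dims, resample=resample)
    alpha = alpha.resize(out_dims, resample=resample)
    return np.dstack((
        np.asarray(rgb).astype("float32") / 255,
        np.asarray(alpha).astype("float32") / 255,
    ))
```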
Correct. Splitting them was the first idea I had, but I thought this was a way to accomplish it without scaling twice. Apparently not. Should we have an option to scale alpha separately, rather than do it by default? It's really only necessary for certain game textures. Though Rotate already has an awful lot of options.
Is there a reason we shouldn't just do it all the time? Like how the upscale nodes always handle transparency by default?
What @RunDevelopment mentioned above, I believe. Transparent colors will influence visible ones, which is not always desirable.
What if we resize the RGB multiplied by the mask as well as the regular RGB, then overlay the pre-multiplied one over the original so the colors stay the same for everything that matters, then combine with the mask?
then overlay the pre-multiplied one over the original
How will you do the overlay? If you're thinking of using a normal overlay (as in, "Normal" in our Blend node), then this isn't going to work. You'll still get incorrect colors. In my example above, the correct final color was RGBA=1 0 0 75%. If you overlay this on any of the other colors, you won't get the correct color again.
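For concreteness, here is that check with a standard "over" composite (plain alpha-compositing math, nothing node-specific):

```python
import numpy as np

def over(top_rgb, top_a, bottom_rgb):
    # Standard "over" compositing onto an opaque background.
    return top_a * top_rgb + (1 - top_a) * bottom_rgb

# The correct resized color was RGBA = (1, 0, 0, 0.75). Overlaying it onto
# e.g. opaque green does not give (1, 0, 0) back:
print(over(np.array([1.0, 0.0, 0.0]), 0.75, np.array([0.0, 1.0, 0.0])))
# -> [0.75, 0.25, 0.0]
```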
I thought of a solution, I think, but it seems kind of extra? Basically: let PIL handle the image correctly by doing normal RGBA scaling, then do the same for the transparent pixels by making all the transparent colors opaque and vice versa, and then replace the color values in the first image with the values from the second wherever alpha == 0. Idk if there's a way to do that directly in PIL so that it's not necessary to convert twice as many images, because it's a big performance hit like this.
```python
import numpy as np
from PIL import Image

# Assumes this runs inside the resize function, with `img` (float32 RGBA
# in [0, 1]), `out_dims`, and `interpolation` already in scope.

# Build a second image whose alpha is the binarized inverse of the original,
# so fully transparent pixels become opaque and visible pixels transparent.
img_invert = np.dstack(
    (img[:, :, :3], np.logical_xor(img[:, :, 3], np.ones_like(img[:, :, 3])))
)

pimg = Image.fromarray((img * 255).astype("uint8"))
pimg_invert = Image.fromarray((img_invert * 255).astype("uint8"))

# Resize both; PIL alpha-weights the colors of each image.
pimg = pimg.resize(out_dims, resample=interpolation)  # type: ignore
pimg_invert = pimg_invert.resize(out_dims, resample=interpolation)  # type: ignore

resized_img = np.array(pimg).astype("float32") / 255
resized_img_invert = np.array(pimg_invert).astype("float32") / 255

# Wherever the resized alpha is 0, take the colors from the inverted image;
# everywhere else, keep the normally resized colors. Alpha always comes from
# the normal resize.
imgout = np.dstack(
    (
        np.where(
            np.expand_dims(resized_img[:, :, 3], 2).repeat(3, 2) == 0,
            resized_img_invert[:, :, :3],
            resized_img[:, :, :3],
        ),
        resized_img[:, :, 3],
    )
)
return imgout
```
This also isn't a good solution. Imagine an image with fully opaque red pixels and fully transparent green pixels in some shape and you resize it 25%. If you then look at the color channel of the resized image, you'll see sharp edges between the red and green parts of the image - an aliasing artifact.
Look, the problem is that there is no way to have both color-correct resizing and to preserve the color of transparent pixels (AFAIK). This is why I suggested letting users choose which method they want: either multiplied alpha, or color and alpha independently.
In the example zamboni just showed, resizing color and alpha independently is what's needed. But multiplied alpha is way more useful in the common case.
Then yeah, we should just have a dropdown for alpha mode
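As a sketch of what that dropdown could dispatch to (all names hypothetical, not chaiNNer's actual node API; the two branches are just the two methods discussed above):

```python
from enum import Enum

import numpy as np
from PIL import Image

class AlphaMode(Enum):
    ALPHA_WEIGHTED = 0  # PIL's RGBA behavior; transparent pixels become transparent black
    INDEPENDENT = 1     # preserve transparent colors; invisible pixels bleed into visible ones

def resize_rgba(img: np.ndarray, out_dims, resample, mode: AlphaMode) -> np.ndarray:
    if mode is AlphaMode.ALPHA_WEIGHTED:
        pimg = Image.fromarray((img * 255).astype("uint8"), mode="RGBA")
        return np.asarray(pimg.resize(out_dims, resample=resample)).astype("float32") / 255
    # INDEPENDENT: resize color and alpha channels separately.
    rgb = Image.fromarray((img[:, :, :3] * 255).astype("uint8"), mode="RGB")
    alpha = Image.fromarray((img[:, :, 3] * 255).astype("uint8"), mode="L")
    return np.dstack((
        np.asarray(rgb.resize(out_dims, resample=resample)).astype("float32") / 255,
        np.asarray(alpha.resize(out_dims, resample=resample)).astype("float32") / 255,
    ))
```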
Look, the problem is that there is no way to have both color-correct resizing and to preserve the color of transparent pixels (AFAIK).
😩
Can we close this now as "intended behavior"?
Intended behavior
I suspect that this is PIL optimizing the images for compressed size, but it seems it can cause issues for people working with 3D model textures. Either we need to figure out how to prevent PIL from doing this, or resize color and alpha channels separately.