Datashader shapes outlines and fill_alpha documentation

Sonja-Stockhaus commented 2 weeks ago

Status after #309, just to document some thoughts/details about the comparison between ds and mpl:

1. The outline linewidth with datashader seems to vary slightly between, e.g., polygons and circles, which is not the case with matplotlib.
2. Below you can see that the outlines of the circles are slightly cut off where the extent of the circles ends, even though the extent is later expanded because of the polygons. This is similar to what we see with points. It should only be a problem with large linewidth values, though.
3. fill_alpha in ds looks lighter than in mpl, even though it is 0.5 both times. Reason: in ds, fill_alpha is applied twice: once during the shade() step, to make sure elements are blended with each other, and again when the resulting image is rendered, to make sure things that were plotted earlier still shine through. Possible fix: use sqrt(fill_alpha) both times. Problem: if something else is rendered below, it would be obvious that the elements are less transparent than the user expects (e.g. 0.7 instead of 0.5), which could be confusing.

import spatialdata_plot  # noqa: F401 (importing registers the .pl accessor)
from spatialdata.datasets import blobs

blob = blobs()

# Datashader backend
blob.pl.render_shapes(method="datashader", fill_alpha=0.5, outline_alpha=0.7, color="red", outline_color="blue", outline_width=5).pl.show()

# Matplotlib backend, same parameters
blob.pl.render_shapes(method="matplotlib", fill_alpha=0.5, outline_alpha=0.7, color="red", outline_color="blue", outline_width=5).pl.show()

[image]

LucaMarconato commented 2 weeks ago

Thanks for reporting. I'll comment on point 3. Could you please link the two places in the code where the alpha is used? I wonder if we can set the second alpha to 1. Maybe it is a datashader bug that we can report/fix?

Sonja-Stockhaus commented 2 weeks ago

Sure: https://github.com/scverse/spatialdata-plot/blob/febd424b75f797648b2c7cfcf04368365693f7e1/src/spatialdata_plot/pl/render.py#L256-L261 and https://github.com/scverse/spatialdata-plot/blob/febd424b75f797648b2c7cfcf04368365693f7e1/src/spatialdata_plot/pl/render.py#L277-L279

I think the problem is inherent to the way we render: first, create an image using datashader and then render it as if it was a SpatialImage. These 2 steps lead to fill_alpha being used twice, so I'd say it's not a datashader bug.

If we set the second one to 1, we again get the problem that the shapes will not look transparent if something else, e.g. an image, was rendered before them. See below (e.g. the triangles are not see-through anymore).

[image]

LucaMarconato commented 2 weeks ago

Thanks, clear now. I'd keep things like this for the moment and merge the PR, and think about this later on.

I think the problem can be formulated as follows: consider an image A, a background image B and a white image W. From my understanding, the image produced by datashader, let's call it D, is the alpha blend of A and W. As a formula: D = alpha * A + (1 - alpha) * W. The matplotlib image is alpha * A + (1 - alpha) * B.

Probably one can find a function f, and a value alpha', such that the alpha blend (using alpha'), between f(D) and B is equal to the alpha blend (using alpha) between A and B.

The above shows that using alpha' = alpha and f = identity doesn't work, but probably the solution can be easily found.

LucaMarconato commented 2 weeks ago

I was too lazy to do the math, so I gave the new o1-preview from ChatGPT a try. It gives the following (I checked the calculations and they are correct):

Can you try it out, please? Basically, the SpatialImage D should become f(D) before this code is called: https://github.com/scverse/spatialdata-plot/blob/febd424b75f797648b2c7cfcf04368365693f7e1/src/spatialdata_plot/pl/render.py#L277-L279

Please note the assumption that the image produced by datashader has RGB values between 0 and 1; if not, it should be scaled first. If the datashader image has an alpha channel, the formula may not work, and we would need to adjust the formulation of the problem and find the formula for the new version.
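For reference, here is a minimal NumPy sketch of one candidate solution under the model used in this thread (D = alpha * A + (1 - alpha) * W, RGB values in [0, 1], no alpha channel in D). Requiring alpha' * f(D) + (1 - alpha') * B = alpha * A + (1 - alpha) * B for every background B gives alpha' = alpha and f(D) = (D - (1 - alpha) * W) / alpha, i.e. f simply recovers A. This is not necessarily the formula referred to above; `alpha_blend` and `unblend_from_white` are hypothetical helper names, not part of spatialdata-plot or datashader.

```python
import numpy as np


def alpha_blend(top, bottom, alpha):
    """Standard alpha blend of `top` over `bottom` (RGB floats in [0, 1])."""
    return alpha * top + (1.0 - alpha) * bottom


def unblend_from_white(d, alpha):
    """Hypothetical f: undo a blend onto white.

    Assumes d = alpha * A + (1 - alpha) * W with W = 1 everywhere,
    so A = (d - (1 - alpha) * W) / alpha.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    white = np.ones_like(d)
    recovered = (d - (1.0 - alpha) * white) / alpha
    # Guard against small numerical overshoot outside the valid RGB range.
    return np.clip(recovered, 0.0, 1.0)


alpha = 0.5
a = np.array([1.0, 0.0, 0.0])  # shape colour A (red)
b = np.array([0.2, 0.4, 0.8])  # background B (arbitrary example values)
w = np.ones(3)                 # white W

d = alpha_blend(a, w, alpha)       # modelled datashader output D
naive = alpha_blend(d, b, alpha)   # f = identity, alpha' = alpha
fixed = alpha_blend(unblend_from_white(d, alpha), b, alpha)
target = alpha_blend(a, b, alpha)  # direct blend of A onto B

print("naive :", naive)
print("fixed :", fixed)
print("target:", target)
```

Under this model a pixel where no shape was drawn has D = W, and f(W) = W, so such pixels would still be blended onto B as white. They would presumably have to be masked or kept fully transparent separately, which is where an alpha channel in the datashader image would come in.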

LucaMarconato commented 2 weeks ago

This is the answer from the bot for future reference.

Sonja-Stockhaus commented 2 weeks ago

I think I got you, thanks for the input! Brain dump from my side:

Shouldn't the datashader rendering rather be alpha * (alpha * A + (1 - alpha) * W) + (1 - alpha) * B? That equals alpha² * A + (alpha - alpha²) * W + (1 - alpha) * B, which in turn explains why the image is too pale (since we weigh A with alpha²). If we used e.g. sqrt(alpha) instead of alpha, we would get alpha * A + (sqrt(alpha) - alpha) * W + (1 - sqrt(alpha)) * B, which uses the same weight for A as mpl (alpha * A + (1 - alpha) * B), but a different one for B, and it still has W in it.

Does that make any sense?
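The weights in the expansion above can be checked numerically. The sketch below assumes the two-step model discussed in this thread (a shade step onto white, then a render step onto the background); `layer_weights` is a hypothetical helper written for illustration only.

```python
import math


def layer_weights(alpha_shade, alpha_render):
    """Return the weights of (A, W, B) in the two-step blend

    alpha_render * (alpha_shade * A + (1 - alpha_shade) * W)
        + (1 - alpha_render) * B
    """
    weight_a = alpha_render * alpha_shade
    weight_w = alpha_render * (1.0 - alpha_shade)
    weight_b = 1.0 - alpha_render
    return weight_a, weight_w, weight_b


alpha = 0.5
root = math.sqrt(alpha)

current = layer_weights(alpha, alpha)        # alpha used in both steps
sqrt_variant = layer_weights(root, root)     # sqrt(alpha) used in both steps
opaque_render = layer_weights(alpha, 1.0)    # second alpha set to 1
matplotlib_like = (alpha, 0.0, 1.0 - alpha)  # direct blend of A onto B

for name, weights in [
    ("current", current),
    ("sqrt variant", sqrt_variant),
    ("second alpha = 1", opaque_render),
    ("matplotlib", matplotlib_like),
]:
    print(f"{name:>16}: A={weights[0]:.3f} W={weights[1]:.3f} B={weights[2]:.3f}")
```

For alpha = 0.5, A is weighted 0.25 in the current scheme versus 0.5 in matplotlib. The sqrt variant restores the weight of A but leaves a nonzero weight on W and a smaller weight on B, and setting the second alpha to 1 removes B entirely.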

LucaMarconato commented 2 weeks ago

> shouldn't the datashader rendering rather be alpha * (alpha * A + (1 - alpha) * W) + (1 - alpha) * B

Yes, this would be the case with f equal to the identity and alpha' equal to alpha. But as you observed, both this case and the case with alpha' = sqrt(alpha) give results that differ from the one obtained by blending A directly onto B without involving W.

Datashader shapes outlines and fill_alpha documentation #367