scverse / spatialdata

An open and interoperable data framework for spatial omics data
https://spatialdata.scverse.org/
BSD 3-Clause "New" or "Revised" License
226 stars 43 forks source link

Offset when rasterizing data #165

Open LucaMarconato opened 1 year ago

LucaMarconato commented 1 year ago

When rasterizing images or labels (basically resampling), there is an offset (see picture) that I think is due to the fact that somewhere in the code for rasterization the coordinates of the pixels are considered to be associated with the top left corner, and in other parts of the code for rasterization the coordinates refer to the center of the pixels.

This half pixel difference is not negligible since when we align together images of different resolutions, one pixel for one image can be many pixels for another.

The screenshot is from this example: https://github.com/scverse/spatialdata/blob/eb99aebbdb4f7b7ee69d9663c5c996f5fa8f0440/examples/dev-examples/spatial_query_and_rasterization.py#L15

image

LucaMarconato commented 1 year ago

Another important point, the rasterization functions could be used for computing aggregation functions. Until this offset problem is addressed, the aggregation would be imprecise.

LucaMarconato commented 8 months ago

The function transform_to_data_extent() (which is coming soon in a new PR) is affected by this bug. I had to create a test with permissive tolerance, which reads as assert np.allclose(a, first_a, rtol=0.005).

LucaMarconato commented 8 months ago

The bug is not just of rasterize, but multiple factors are involved. First, napari considers the coordinates as the center of pixels. Here is for instance the effect of rotating an image by -45 degrees, and showing also the (0, 0) coordinate before and after rotation. Note that the rotation doesn't preserve the location of the (0, 0)-pixel top-left corner.

I need to check, but I suspect that ndinterp() used in transform() and rasterize() may not operate as in napari, leading to some misalignment.

image

LucaMarconato commented 8 months ago

I confirm that one of the reasons for the bug is the fact that (0, 0) is not the top-left corner but the center of the top-left pixel, as shown in the image above. I am fixing this in #453. Nevertheless, there is also another problem, apparently due to ndinterp.affine_transform(): https://github.com/scipy/scipy/issues/20038.

LucaMarconato commented 7 months ago

After digging deeper into this I found out the causes of the bug, some of which can be fixed (as I did partially did in #453), but one of them requires a design choice to be made. I will explain this here.

The bugs are the following (they all lead to an error to up to 1 pixel, which is noticeable when the pixel sizes are large):

  1. ndinterp.affine_transform() adds a black border padding. Tracked here https://github.com/scipy/scipy/issues/20038. This is probably due to the interpolation algorithm, even if theoretically, since we don't use interpolation, this could be avoided. A workaround (computationally expensive) could be an optional flag that induces a padding of the image before calling ndinterp.affine_transform(), and followed by a appropriate cropping. Anyway, even if there is a noticeable border padding added, the alignment of the non-black pixels is not compromised.
  2. The bounding box query function adds the wrong translation vector, this is fixed in #453 (not merged).
  3. The functions rasterize() and _transform_raster(), which both call ndinterp.affine_transform(), need to adjust the transformation matrix to account to how a raster image is positioned with respect to the coordinates of the cartesian axes. Practically, once the point 4. below will be addressed, we have to adjust the implementation. Also, the offset that is added to the rasterized image (in rasterize()) and to the transformed image (in _set_transformation_for_transformed_elements()), needs to take into account for this).

The design choice to be made is due to the following:

  1. We have to choose where the center of the pixel (0, 0) is, with respect to the cartesian point (0, 0). Napari chooses to (a) have the center of the pixel (0, 0) in the point (0, 0), I propose instead to (b) have it in (0.5, 0.5).

image

@timtreis @Sonja-Stockhaus I kindly ask for feedback, how does matplotlib operates? With the approach (a) or (b)?


Some example implications of the case (a) are the following:

Some implications of the case (b) are the following:

LucaMarconato commented 7 months ago

The approach (a) is motivated by http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf. After reading this (just done) I am more inclined for this approach. We should carefully what this implies for the result of the transform() and rasterize() APIs, and how matplotlib operates before continuing.

timtreis commented 7 months ago

Using extent and origin I can pretty much place the pixel anywhere I want but the default also seems to be the center of the pixel, like Napari. I personally also find this more intuitive.

import matplotlib.pyplot as plt
import numpy as np

matrix = np.array([[2]])

plt.imshow(matrix, cmap='gray')
plt.grid(color='red', linestyle='-', linewidth=2)
plt.xticks(np.arange(-1, 2))
plt.yticks(np.arange(-1, 2))

plt.show()

image

LucaMarconato commented 7 months ago
LucaMarconato commented 4 months ago
LucaMarconato commented 3 months ago