saeyslab / napari-sparrow

Make it possible to restart at every step #99

Closed lopollar closed 1 year ago

lopollar commented 1 year ago

Now it isn't possible to restart at every step, which makes everything slow.

Multiple problems with the anndata:

- Saved in two separate files (h5ad and geojson): a function needs to be written to combine them again when reading them in.

Then, in the napari code: add a load and save option for every step.

ArneDefauw commented 1 year ago

Due to an issue in spatialdata (https://github.com/scverse/spatialdata/issues/186), it is not possible to restart at every step: we are not allowed to overwrite the .zarr store that holds the SpatialData object, which is meant to prevent accidental data loss on disk.

lopollar commented 1 year ago

Would it be possible to read in the object and go on with the analysis? To write a function that sees which steps have been performed (by looking at the objects), and then performing the next steps on the same object?
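
A minimal sketch of that idea in plain Python. The step names and the layer each step is assumed to write are hypothetical, not napari-sparrow's actual pipeline, and `existing_layers` stands in for whatever layer names are found on the loaded object.

```python
# Hypothetical pipeline: each step paired with the layer it is assumed to write.
PIPELINE = [
    ("tiling_correction", "tiling_correction"),
    ("segmentation", "segmentation_mask"),
    ("allocation", "table"),
]


def next_step(existing_layers):
    """Return the first step whose output layer is missing, or None if all are done."""
    present = set(existing_layers)
    for step, output_layer in PIPELINE:
        if output_layer not in present:
            return step
    return None


def remaining_steps(existing_layers):
    """Return every step from the first missing output onwards, in pipeline order."""
    present = set(existing_layers)
    for index, (_, output_layer) in enumerate(PIPELINE):
        if output_layer not in present:
            return [step for step, _ in PIPELINE[index:]]
    return []
```

Everything from the first missing output onwards is treated as still to do, on the assumption that later steps depend on earlier ones.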

ArneDefauw commented 1 year ago

There are two issues in spatialdata that prevent us from restarting at every step.

1) https://github.com/scverse/spatialdata/issues/186 . We are not allowed to overwrite the .zarr store which holds the SpatialData object. I think this is the behaviour we want, since it prevents accidental data loss. However, the user can circumvent this issue/feature of spatialdata simply by specifying another output layer. E.g. for tiling correction:

sdata, flatfield = fc.tilingCorrection( sdata=sdata, crop_param=crop_param, output_layer="tiling_correction" )

sdata will now contain the image 'tiling_correction', and so will the .zarr store. But we are still allowed to do:

sdata, flatfield = fc.tilingCorrection( sdata=sdata, crop_param=crop_param, output_layer="tiling_correction_2" )

Now sdata will contain the images 'tiling_correction' and 'tiling_correction_2'. So in a way this issue in spatialdata does not prevent us from restarting at every step.

2) The second issue relates to the coordinates: they are not preserved when saving to the .zarr store. For example, when we do this

sdata = fc.create_sdata(
    filename_pattern=path_image,
    output_path=os.path.join(OUTPUT_DIR, "sdata.zarr"),
    layer_name=layer_name,
    chunks=1024,
)

then

sdata[ 'raw_image' ].x.data is equal to

array([0.0000e+00, 1.0000e+00, 2.0000e+00, ..., 1.0717e+04, 1.0718e+04,
       1.0719e+04])

however when we do

from spatialdata import read_zarr
sdata_load=read_zarr( os.path.join( OUTPUT_DIR, 'sdata.zarr' ) )

then sdata_load[ 'raw_image' ].x.data

is equal to

array([5.00000e-01, 1.50000e+00, 2.50000e+00, ..., 1.07175e+04,
       1.07185e+04, 1.07195e+04])

This is especially an issue when working with cropped images.
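
The shift can be illustrated without spatialdata. The sketch below assumes that the reloaded coordinates are pixel centres (index + 0.5), as in the arrays above, and that the origin of the crop is known from elsewhere. `restore_axis_coords` and `crop_origin` are hypothetical names, not part of napari-sparrow or spatialdata.

```python
def restore_axis_coords(reloaded, crop_origin=0.0):
    """Map reloaded pixel-centre coordinates (0.5, 1.5, ...) back to
    integer-aligned coordinates that start at crop_origin."""
    return [value - 0.5 + crop_origin for value in reloaded]


reloaded_x = [0.5, 1.5, 2.5]
print(restore_axis_coords(reloaded_x))          # [0.0, 1.0, 2.0]
print(restore_axis_coords(reloaded_x, 1000.0))  # [1000.0, 1001.0, 1002.0]
```

For an uncropped image the origin is 0 and only the half-pixel shift is undone. For a crop, the origin has to be supplied, because it cannot be recovered from the reset coordinates themselves.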

So as long as we do not explicitly read from the .zarr store, all is fine, and the different steps can be rerun as long as we specify an output layer that is not yet used. However, when we read from the .zarr store and want to restart the analysis, this coordinate issue means we would first have to set the correct coordinates.
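
Picking an output layer that is not yet used could be automated along these lines. This is only a sketch: `unused_layer_name` is a hypothetical helper, and `existing_layers` stands in for the layer names already present in sdata.

```python
def unused_layer_name(base, existing_layers):
    """Return base if it is free, otherwise the first free name among
    base_2, base_3, ..."""
    existing = set(existing_layers)
    if base not in existing:
        return base
    suffix = 2
    while f"{base}_{suffix}" in existing:
        suffix += 1
    return f"{base}_{suffix}"


layers = ["raw_image", "tiling_correction"]
print(unused_layer_name("tiling_correction", layers))  # tiling_correction_2
```

The returned name could then be passed as output_layer, as in the tilingCorrection calls above.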

SilverViking commented 1 year ago

I suspect that the philosophy of SpatialData for dealing with crops is to add a coordinate transformation (a translation) to the image, instead of tweaking the coordinates array of the SpatialImage. These coordinate transformations are saved to and loaded from zarr correctly, and saving to zarr also does not ruin the coordinate transformations of the sdata object in memory (whereas the coords arrays indeed simply get reset).

The spatial query notebook seems to illustrate that idea: using SpatialData.query() to create a crop of a SpatialData object automatically introduces a Translation but does not modify the image coordinates array. See the last section "transformations are preserved after spatial query".
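
The idea can be sketched in plain Python, without using spatialdata's own classes. The `Translation` class and `crop_with_translation` function below are hypothetical illustrations, not the library's API: the crop keeps local coordinates starting at 0, and a separate translation records where it sits in the full image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Translation:
    """Offset of a crop's origin within the full image, in pixels."""
    x: float
    y: float

    def to_global(self, local_x, local_y):
        """Convert coordinates local to the crop into full-image coordinates."""
        return (local_x + self.x, local_y + self.y)


def crop_with_translation(x_min, y_min, x_max, y_max):
    """Describe a crop as its (width, height) plus the translation that places it.

    The crop's own pixel coordinates still start at 0; only the
    translation records the crop's position in the full image.
    """
    size = (x_max - x_min, y_max - y_min)
    return size, Translation(x=x_min, y=y_min)


size, shift = crop_with_translation(1000, 2000, 1500, 2600)
print(size)                     # (500, 600)
print(shift.to_global(0, 0))    # (1000, 2000)
print(shift.to_global(10, 20))  # (1010, 2020)
```

Because the position lives in the transformation rather than in the coordinates array, resetting that array on save or load would not lose the crop's location.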

ArneDefauw commented 1 year ago

Related to this issue, we needed a fix to make sure the anndata object in sdata.table stays in sync with the table in the .zarr store. This is now the case via this commit: https://github.com/saeyslab/napari-sparrow/commit/476f6e7bfbd2eaf643f7e3cb164e282a9d3718fe .

SilverViking commented 1 year ago

Image crops (saving to/loading from zarr) are now supported, via commit https://github.com/saeyslab/napari-sparrow/commit/4df7ac250c473f20bb07062aee64bb06f7386bb5.

ArneDefauw commented 1 year ago

This should be fixed via https://github.com/saeyslab/napari-sparrow/commit/476f6e7bfbd2eaf643f7e3cb164e282a9d3718fe and https://github.com/saeyslab/napari-sparrow/commit/4df7ac250c473f20bb07062aee64bb06f7386bb5