ome / napari-ome-zarr

A napari plugin for zarr backed OME-NGFF images
https://www.napari-hub.org/plugins/napari-ome-zarr
BSD 3-Clause "New" or "Revised" License

Handle AnnData regions tables #85

Open will-moore opened 1 year ago

will-moore commented 1 year ago

With https://github.com/ome/ome-zarr-py/pull/256 (work in progress) we hope to have AnnData tables available to napari-ome-zarr.

I've been using this to read "points" tables and display as tracks: https://github.com/ome/napari-ome-zarr/pull/81

So now I also want to handle "regions" tables in napari.

I've been looking at the https://github.com/kevinyamauchi/ome-ngff-tables-prototype/blob/main/examples/save_load_squidpy_segment.py example.

The logic as I follow it goes:

Then you can use https://www.napari-hub.org/plugins/napari-properties-viewer to show a table of features for each label:

In this screenshot, I edited the code to use cell_id for color rather than a random column:

Screenshot 2023-03-17 at 10 19 12

I was thinking of doing something similar in napari-ome-zarr with the following differences:

cc @kevinyamauchi @LucaMarconato @giovp

LucaMarconato commented 1 year ago

Very cool to see this update coming to napari-ome-zarr!

A few comments on the procedure you described. In napari_spatialdata we follow a similar approach to map the labels (matching the IDs, sorting the rows, etc.), but we don't add extra rows for labels that are not in the table. We simply don't annotate them: for instance, if the background is not in the table, the background will be transparent, and we give a warning for labels that are not background and are not in the table.
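The matching behaviour described above could be sketched roughly as follows. This is a minimal illustration and not the napari_spatialdata code: the function name `match_labels_to_table`, the `background=0` default and the plain-list inputs are all assumptions made for the example (in practice the label IDs would probably come from something like `np.unique` on the label image).

```python
import warnings


def match_labels_to_table(label_ids, table_ids, background=0):
    """Split label IDs into annotated and unannotated ones.

    Hypothetical helper: `label_ids` are the IDs found in the label image,
    `table_ids` are the instance IDs stored in the table rows.
    """
    table_set = set(table_ids)
    present = sorted(set(label_ids))

    # Labels that have a table row, in sorted label order, so the table
    # rows can be lined up with the labels.
    annotated = [i for i in present if i in table_set]

    # Labels without a row are left unannotated; no extra rows are added.
    unannotated = [i for i in present if i not in table_set]

    # Only warn about missing labels that are not the background.
    missing = [i for i in unannotated if i != background]
    if missing:
        warnings.warn(f"labels not found in table: {missing}")

    return annotated, unannotated
```

With label IDs `[0, 3, 1, 2, 3]` and table IDs `[2, 1]`, labels 1 and 2 are annotated, the background 0 is silently left unannotated, and label 3 triggers the warning.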

You can see this in action by 1) cloning spatialdata-sandbox and the spatialdata branch of napari_spatialdata, 2) going to the mibitof folder, 3) running first download.py and then to_zarr.py 4) and finally launching python -m napari_spatialdata view data.zarr.

The relevant parts of the code are here (matching the AnnData table with the labels) and here (creating the labels colored layer). One disclaimer: the code in that branch works but is messy. We want to fix it soon, but we don't have time right now.

LucaMarconato commented 1 year ago

Another comment on using the table to store other types of geometries: I think that storing points, circles, etc. in tables should still be considered to be in a prototype phase.

In the ome-ngff-tables-prototype repo we tried using the table to store points, circles and even (in a hacky way) polygons, as shown in the merfish example, even though the NGFF table spec doesn't say anything about how to store coordinates in a table. After these tests, we decided to use tables (in spatialdata) only for storing annotations, not spatial coordinates or geometries. We still use tables to annotate features on circles and polygons (so that in a downstream analysis it doesn't matter whether, say, cells are represented as polygons or labels), but we don't allow tables to be in a coordinate system. Instead, we save the coordinates in .parquet files or as ragged representations of polygons saved to Zarr, and we handle IO with libraries like geopandas, dask (dataframes) and dask-geopandas.

In the future we should all share the experience we have gained when discussing how to store coordinates, geometries and ROIs in terms of future NGFF specs.

will-moore commented 1 year ago

Hi Luca, I just tried the spatialdata-sandbox/mibitof download and to_zarr.py....

(spatialdata) Williams-MacBook-Pro:spatialdata-sandbox wmoore$ cd mibitof/
(spatialdata) Williams-MacBook-Pro:mibitof wmoore$ python download.py 
(spatialdata) Williams-MacBook-Pro:mibitof wmoore$ python to_zarr.py 
Traceback (most recent call last):
  File "/Users/wmoore/Desktop/SPATIALDATA/spatialdata-sandbox/mibitof/to_zarr.py", line 6, in <module>
    from spatialdata import SpatialData
ImportError: cannot import name 'SpatialData' from 'spatialdata' (unknown location)

I have the following versions:

$ pip freeze | grep spatialdata
-e git+https://github.com/scverse/napari-spatialdata@2716a406f28dcf889474ab60671cde360c7327a2#egg=napari_spatialdata
-e git+ssh://git@github.com/scverse/spatialdata.git@8266a0f4a2d6a3ef7373b4f1f855de4a7b3c184c#egg=spatialdata
-e git+https://github.com/scverse/spatialdata-io@55ca02d52030757da62bc69802e34a15e28aa70d#egg=spatialdata_io
-e git+ssh://git@github.com/scverse/spatialdata-notebooks.git@6c54e94874d96450c7a1453f68d53445a7fb0ea0#egg=spatialdata_notebooks

Previously, before I updated to the latest spatialdata commit, I got:

Traceback (most recent call last):
  File "/Users/wmoore/Desktop/SPATIALDATA/spatialdata-sandbox/mibitof/to_zarr.py", line 7, in <module>
    from spatialdata.transformations import Identity
ModuleNotFoundError: No module named 'spatialdata.transformations'

when at commit:

commit 85a813b285d21fed51405f3fd5cd46f17b3017f9 (HEAD)
Merge: 848e063 72d4732
Author: LucaMarconato <2664412+LucaMarconato@users.noreply.github.com>
Date:   Fri Mar 10 00:35:03 2023 +0100
    Merge pull request #183 from scverse/some_docstrings

Any idea what branch/commit I need to use here for all those repos?

will-moore commented 1 year ago

@LucaMarconato I don't know if you've seen https://github.com/ome/ngff/issues/178 which discusses other table formats as part (or NOT) of the NGFF spec, including parquet?

@kkyoda is using NGFF AnnData tables to store tracks (e.g. see https://github.com/openssbd/bdz/pull/2) and I have been looking to handle that in napari-ome-zarr to display tracks in napari - see #81
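For context, napari's Tracks layer expects rows of the form `[track_id, t, (z), y, x]`, sorted by track ID and then by time. A minimal sketch of turning point records into that layout is below. The column names `track_id`, `t`, `y` and `x` are assumptions for illustration only; the tables in the linked PRs may use different names.

```python
def rows_to_tracks(rows, id_key="track_id", time_key="t", axes=("y", "x")):
    """Convert point records into napari-style track rows.

    `rows` is a list of dicts, one per point. The column names are
    assumptions for this sketch, not names taken from any spec.
    """
    tracks = [
        [row[id_key], row[time_key], *(row[a] for a in axes)]
        for row in rows
    ]
    # Sort by track ID first, then by time within each track.
    tracks.sort(key=lambda r: (r[0], r[1]))
    return tracks


points = [
    {"track_id": 2, "t": 0, "y": 5.0, "x": 1.0},
    {"track_id": 1, "t": 1, "y": 2.0, "x": 3.0},
    {"track_id": 1, "t": 0, "y": 1.0, "x": 3.0},
]
tracks = rows_to_tracks(points)
```

The resulting rows could then be converted to an array and passed to `viewer.add_tracks`.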

But if you prefer to store points & tracks in parquet then we're already seeing divergence on this and it would be good to converge so we don't waste time on different solutions.

What are the advantages of parquet for that data, compared with AnnData? Maybe add to the discussion at https://github.com/ome/ngff/issues/178? I'm not at all familiar with parquet, and I don't see any maintained JavaScript tools for reading the data, which is a shame but maybe not a blocker.

LucaMarconato commented 1 year ago

> Any idea what branch/commit I need to use here for all those repos?

Hi, I think the problem is due to the fact that the version in pip is not the current one (we are not updating pip regularly since we haven't released yet). If you do an editable install of the main branch it should work.

Additionally, please mind the following two points:

LucaMarconato commented 1 year ago

Initially we drafted the table specification to also specify how to store points and other geometries https://github.com/ome/ngff/pull/64#issuecomment-1133393761. But later we decided to simplify it and focus solely on how to use tables to store annotations for labels https://github.com/ome/ngff/pull/64/commits/f8f2fd0f386d779ed153caa4cebed8535bd5ec71.

This is because otherwise the table specification would have had to define transformations and coordinate systems for the coordinates. The transforms spec is still being discussed, so we wanted to stay decoupled from it and iterate later on.

As a result of not specifying how to store geometries in the table specification, we selected the storage method that best suited development and runtime efficiency. Currently, we use .parquet files for points and ragged arrays saved to Zarr for circles and polygons. The first makes lazy loading the data possible, the second is convenient for io with geopandas.
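To illustrate what a "ragged" representation of polygons means in general: polygons with different numbers of vertices can be stored as one flat vertex list plus an offsets list marking where each polygon starts and ends. The sketch below is generic and uses hypothetical helper names; it does not describe the actual on-disk layout used by spatialdata, which may differ.

```python
def encode_ragged(polygons):
    """Flatten polygons into one vertex list plus per-polygon offsets."""
    coords = []
    offsets = [0]
    for polygon in polygons:
        coords.extend(polygon)
        # Each offset marks the end of one polygon and the start of the next.
        offsets.append(len(coords))
    return coords, offsets


def decode_ragged(coords, offsets):
    """Rebuild the list of polygons from the flat representation."""
    return [
        coords[start:stop]
        for start, stop in zip(offsets[:-1], offsets[1:])
    ]


polygons = [
    [(0, 0), (1, 0), (1, 1)],          # triangle
    [(2, 2), (4, 2), (4, 4), (2, 4)],  # square
]
coords, offsets = encode_ragged(polygons)
```

Both `coords` and `offsets` are regular (non-ragged) sequences, which is what makes this layout convenient to store in array-based formats.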

After the first iteration of development is complete, and the transformation specification is finalized, we would like to re-discuss how to store points and shapes, agree on a representation and change our io APIs.

will-moore commented 1 year ago

I'm not installing anything via PyPI. I have checked out:

https://github.com/scverse/napari-spatialdata/tree/spatialdata (2716a40)
https://github.com/scverse/spatialdata-io (23be385)
https://github.com/scverse/spatialdata (8266a0f)
https://github.com/scverse/spatialdata-notebooks (d9cfe01)

So I get:

$ pip freeze | grep spatialdata
-e git+https://github.com/scverse/napari-spatialdata@2716a406f28dcf889474ab60671cde360c7327a2#egg=napari_spatialdata
-e git+ssh://git@github.com/scverse/spatialdata.git@8266a0f4a2d6a3ef7373b4f1f855de4a7b3c184c#egg=spatialdata
-e git+https://github.com/scverse/spatialdata-io@23be3852d650b34cc0ae5f4a91ea67acfec11839#egg=spatialdata_io
-e git+ssh://git@github.com/scverse/spatialdata-notebooks.git@d9cfe01cca258b45a34fad8e6db6dd5f5f594a25#egg=spatialdata_notebooks

I'm on this branch of spatialdata-sandbox: * b7a728a (HEAD, origin/main, origin/HEAD) finished viz for data

I still see:

$ python to_zarr.py 
Traceback (most recent call last):
  File "/Users/wmoore/Desktop/SPATIALDATA/spatialdata-sandbox/mibitof/to_zarr.py", line 6, in <module>
    from spatialdata import SpatialData
ImportError: cannot import name 'SpatialData' from 'spatialdata' (unknown location)

I get the same ImportError from a different location if I try:

python -m napari_spatialdata view data.zarr

Traceback (most recent call last):
  File "/Users/wmoore/opt/anaconda3/envs/spatialdata/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/Users/wmoore/opt/anaconda3/envs/spatialdata/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/Users/wmoore/opt/anaconda3/envs/spatialdata/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/Users/wmoore/Desktop/SPATIALDATA/napari-spatialdata/src/napari_spatialdata/__init__.py", line 15, in <module>
    from napari_spatialdata.interactive import Interactive
  File "/Users/wmoore/Desktop/SPATIALDATA/napari-spatialdata/src/napari_spatialdata/interactive.py", line 17, in <module>
    from spatialdata import SpatialData, get_axis_names
ImportError: cannot import name 'SpatialData' from 'spatialdata' (unknown location)

Strangely, looking at the local code for spatialdata it looks like the SpatialData class should be importable, so I'm not sure what's going on...

will-moore commented 1 year ago

OK - so I created a new conda env and reinstalled everything... The installed versions look the same as above, but the import is working this time!

conda create -n spatialdata310 python=3.10
conda activate spatialdata310
cd spatialdata
pip install -e .
cd ../napari-spatialdata/
pip install -e .
cd ../spatialdata-io
pip install -e .
cd ../spatialdata-notebooks/
pip install -e .

$ pip freeze | grep spatialdata
-e git+https://github.com/scverse/napari-spatialdata@2716a406f28dcf889474ab60671cde360c7327a2#egg=napari_spatialdata
spatialdata @ git+https://github.com/scverse/spatialdata.git@8266a0f4a2d6a3ef7373b4f1f855de4a7b3c184c
-e git+https://github.com/scverse/spatialdata-io@23be3852d650b34cc0ae5f4a91ea67acfec11839#egg=spatialdata_io
-e git+ssh://git@github.com/scverse/spatialdata-notebooks.git@d9cfe01cca258b45a34fad8e6db6dd5f5f594a25#egg=spatialdata_notebooks

$ python
Python 3.10.10 (main, Mar 21 2023, 13:41:39) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from spatialdata import SpatialData
>>>

will-moore commented 1 year ago

But...

$ cd spatialdata-sandbox/mibitof/
$ python download.py
$ python to_zarr.py

$ python -m spatialdata view data.zarr
/Users/wmoore/opt/anaconda3/envs/spatialdata310/lib/python3.10/site-packages/geopandas/_compat.py:123: UserWarning: The Shapely GEOS version (3.11.1-CAPI-1.17.1) is incompatible with the GEOS version PyGEOS was compiled with (3.10.4-CAPI-1.16.2). Conversions between both will be slow.
  warnings.warn(
/Users/wmoore/opt/anaconda3/envs/spatialdata310/lib/python3.10/site-packages/spatialdata/__init__.py:9: UserWarning: Geopandas was set to use PyGEOS, changing to shapely 2.0 with:

    geopandas.options.use_pygeos = True

If you intended to use PyGEOS, set the option to False.
  _check_geopandas_using_shapely()
Usage: python -m spatialdata [OPTIONS] COMMAND [ARGS]...
Try 'python -m spatialdata --help' for help.

Error: No such command 'view'.

LucaMarconato commented 1 year ago

Solved via chat: there was a typo in the message shown by print(). The correct command is python -m napari_spatialdata view data.zarr

will-moore commented 1 year ago

Trying that...

$ python -m napari_spatialdata view data.zarr

Traceback (most recent call last):
  File "/Users/wmoore/opt/anaconda3/envs/spatialdata310/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/Users/wmoore/opt/anaconda3/envs/spatialdata310/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/Users/wmoore/opt/anaconda3/envs/spatialdata310/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/Users/wmoore/Desktop/SPATIALDATA/napari-spatialdata/src/napari_spatialdata/__init__.py", line 15, in <module>
    from napari_spatialdata.interactive import Interactive
  File "/Users/wmoore/Desktop/SPATIALDATA/napari-spatialdata/src/napari_spatialdata/interactive.py", line 17, in <module>
    from spatialdata import SpatialData, get_axis_names
ImportError: cannot import name 'get_axis_names' from 'spatialdata' (/Users/wmoore/Desktop/SPATIALDATA/spatialdata/src/spatialdata/__init__.py)

LucaMarconato commented 1 year ago

Can you please do import spatialdata; print(spatialdata.__path__)? I think that it will give you a path in site-packages and not the one in which you cloned the repo, because I think that this command

cd ../spatialdata-io
pip install -e .

tries to override the editable installation even if it is there.
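A slightly more general version of that check, as a sketch: `importlib.util.find_spec` reports where Python would import a package from without actually importing it. The helper name `import_location` is hypothetical.

```python
import importlib.util


def import_location(package_name):
    """Return where Python would import `package_name` from.

    Returns None if the package cannot be found at all.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return None
    if spec.origin is not None:
        return spec.origin
    # Namespace packages have no single file; report their search paths.
    return list(spec.submodule_search_locations or [])
```

Calling `print(import_location("spatialdata"))` would show whether the package resolves to the cloned repo or to a copy in site-packages.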

LucaMarconato commented 1 year ago

Ah! There was a problem introduced by a recent PR. I will fix it. Please note that (at least on my machine) running the commands described in https://github.com/ome/napari-ome-zarr/issues/85#issuecomment-1490381773 will reinstall spatialdata from GitHub. To restore the editable install I had to do:

pip uninstall spatialdata
# cd spatialdata repo
pip install -e .

will-moore commented 1 year ago

Yes, I had already found that I needed to re-install from source! Strange...

will-moore commented 1 year ago

Screenshot 2023-04-03 at 08 25 59

LucaMarconato commented 1 year ago

Great that it works! One more piece of information: we have uploaded the various datasets, already converted to NGFF, to S3 storage, and we are keeping them up to date; you can find the URLs here.

We haven't tested loading the data from the cloud yet (I think we have to fix the consolidated metadata). I'll keep you posted, but if you want to also experiment, any feedback would be appreciated 😊