
New labels annotation tool and tensorstore #6761

Open fjorka opened 3 months ago

fjorka commented 3 months ago

🐛 Bug Report

Using the new annotation tool in the labels layer with a tensorstore array doesn't provide any feedback when the saving operation fails.

💡 Steps to Reproduce

A test script showing interaction with a tensorstore array (a NumPy array is included for comparison). The array has two slices to make it easy to demonstrate re-reading of the annotation.

import napari
import zarr
import tensorstore as ts
import numpy as np

array_size = (2, 2000, 2000)
np_array = np.zeros(array_size, dtype='uint32')

# create an empty zarr array on disk
zarr_path = r'd://example.zarr'
z = zarr.zeros(array_size, chunks=(1, 1000, 1000), dtype='uint32')
zarr.save(zarr_path, z)

# open the on-disk zarr through tensorstore
spec = {
  'driver': 'zarr',
  'kvstore': {
    'driver': 'file',
    'path': zarr_path,
  },
}

ts_array = ts.open(spec).result()

viewer = napari.Viewer()
viewer.add_labels(ts_array, name='ts')
viewer.add_labels(np_array, name='np')

https://github.com/napari/napari/assets/7549583/e30eb1d6-e657-403f-9584-c590c601597f

💡 Expected Behavior

Maybe an error message that saving to the zarr array failed in this situation?

🌎 Environment

napari: 0.5.0a2.dev606+gb3e15c51
Platform: Windows-10-10.0.19045-SP0
Python: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:27:34) [MSC v.1937 64 bit (AMD64)]
Qt: 5.15.2
PyQt5: 5.15.10
NumPy: 1.26.4
SciPy: 1.12.0
Dask: 2024.3.1
VisPy: 0.14.2
magicgui: 0.8.2
superqt: 0.6.2
in-n-out: 0.2.0
app-model: 0.2.5
npe2: 0.7.4

💡 Additional Context

This is related to performance: when the classical drawing tool (brush) is responsive, the new tool also works correctly. However, the performance issue seems unavoidable with an HDD drive (in the example I provide, even a small 2k x 2k px array is not performant), and with the brush the problem is obvious. With the new tool, in contrast, the annotation is displayed correctly but stored incorrectly in the underlying zarr.
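
A quick way to check what actually lands on disk (a sketch, assuming the same d://example.zarr path as in the script above) is to re-read the zarr store directly after painting:

import numpy as np
import zarr

# re-open the store written by the repro script, read-only
z_check = zarr.open(r'd://example.zarr', mode='r')
# compare what is on disk with what the viewer shows; a mismatch (e.g. zero
# non-zero voxels after painting a label) means the labels were not stored correctly
print(np.count_nonzero(z_check[0]))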

psobolewskiPhD commented 3 months ago

I can reproduce this. Using the script in the OP, painting in the tensorstore array is slow, but not as bad as in the video. But when I use the polygon labels tool to draw two rectangles (on the lower right), I get two glitches instead.

[screenshot]

(BTW this approach -- using an overlay -> labels -- was discussed at a community meeting recently as a way to improve labeling, because it could "rasterize" in the background while the user starts the next label)

psobolewskiPhD commented 3 months ago

I tested with a zarr on disk without tensorstore, using:

z2 = zarr.open('test_zarr.zarr', mode='w', shape=(2, 2000, 2000), chunks=(1, 1000, 1000), dtype='i4')

Performance of painting felt similar to tensorstore, and the polygon tool worked!

[screenshot]

psobolewskiPhD commented 3 months ago

I was playing with this pattern: https://google.github.io/tensorstore/python/api/tensorstore.Transaction.html

Script:

import numpy as np
import napari
import tensorstore as ts
import zarr

array_size = (2, 2000, 2000)
np_array = np.zeros(array_size, dtype='uint32')

zarr_path = r'd://example.zarr'
z = zarr.zeros(array_size, chunks=(1, 1000, 1000), dtype='uint32')
zarr.save(zarr_path, z)

spec = {
  'driver': 'zarr',
  'kvstore': {
    'driver': 'file',
    'path': zarr_path,
  },
}

ts_array = ts.open(spec).result()

# all writes go into this transaction (in memory) until it is committed
txn = ts.Transaction()

viewer = napari.Viewer()
viewer.add_labels(ts_array.with_transaction(txn), name='ts')
viewer.add_labels(np_array, name='np')

Painting is much more performant, and glitches with the polygon tool are less frequent (but oddly enough, still happening):

[screenshot]

Here's where I did just polygons and only 2 got messed up:

[screenshot]

But brush painting is fantastic. My understanding is that with the transaction, painting happens in memory with no I/O -- hence the performance. The catch is that you need to commit the changes to the array using txn.commit_async(), but that breaks further changes to the open layer, so you need to re-open the store. Maybe this could be made more graceful via a plugin? A widget with a commit button that commits and re-opens with a new transaction?

psobolewskiPhD commented 3 months ago

Playing a bit further, not sure if this is too hacky, but it does work: paint, commit, refresh transaction, point layer data at the new transaction (just refreshing doesn't suffice):

In [13]: txn = ts.Transaction()

In [14]: viewer.add_labels(ts_array.with_transaction(txn))
Out[14]: <Labels layer 'Labels' at 0x2af2b0e90>

In [15]: txn.commit_async()
Out[15]: <tensorstore.Future at 0x2ae3bdd80>

In [16]: txn = ts.Transaction()

In [17]: viewer.layers[0].data = ts_array.with_transaction(txn)

In [18]: txn.commit_async()
Out[18]: <tensorstore.Future at 0x2b1d61d80>

In [19]: txn = ts.Transaction()

In [20]: viewer.layers[0].data = ts_array.with_transaction(txn)

(For painting it seems to be good, and I think it could be made into a widget. The polygon tool is still unpredictable, though.)
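
A minimal sketch of what such a commit widget could look like, assuming a magicgui button wrapping the commit / new-transaction / re-home steps from the snippet above (the helper make_commit_widget and its arguments are hypothetical, not an existing napari or tensorstore API):

import tensorstore as ts
from magicgui import magicgui

def make_commit_widget(viewer, layer_name, ts_array):
    # hypothetical helper: keep the current transaction in a mutable container
    state = {'txn': ts.Transaction()}
    viewer.layers[layer_name].data = ts_array.with_transaction(state['txn'])

    @magicgui(call_button='Commit labels')
    def commit_labels():
        # flush the in-memory edits to the underlying store...
        state['txn'].commit_async().result()
        # ...then start a fresh transaction and re-home the layer data,
        # since a committed transaction can no longer accept writes
        state['txn'] = ts.Transaction()
        viewer.layers[layer_name].data = ts_array.with_transaction(state['txn'])

    return commit_labels

# viewer.window.add_dock_widget(make_commit_widget(viewer, 'ts', ts_array))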

psobolewskiPhD commented 3 months ago

I assume you could go on painting indefinitely before doing the commit, so it's actually not bad from a UX point of view. It's basically just a "save": commit, make a new transaction, re-home the layer.data. That's too specific for napari to handle itself, but still fine for a plugin/script/notebook. CC: @jni -- not sure who else uses tensorstore, and my google-fu hasn't come up with much in this area.

jni commented 3 months ago

@psobolewskiPhD I haven't played that much with Tensorstore beyond getting plain-old-paint to work, and in fact you are now the expert. 😂 But that exploration seems very interesting. I've thought about a general approach here in which we paint only the view pixels synchronously, and paint into the data asynchronously.

Anyway, the performance issue is, I think, secondary to the polygon glitching issue. I wonder whether it's related to my issues with triangulation in #6654 #5673. I don't think the two issues are related, though it's interesting that you can reduce the glitching by using transactions...

My suspicion is that there's something a bit weird going on with tensorstore indexing — it is a little different from NumPy indexing in that it preserves the original index range (origin) of sliced dimensions:

In [4]: arr = ts.array(np.arange(5*10*12).reshape((5, 10, 12)))

In [5]: b = arr[1:3]

In [6]: b[0]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 b[0]

IndexError: OUT_OF_RANGE: Checking bounds of constant output index map for dimension 0: Index 0 is outside valid range [1, 3) [domain='{origin={1, 0, 0}, shape={2, 10, 12}}'] [transform='Rank 2 -> 3 index space transform:
  Input domain:
    0: [0, 10)
    1: [0, 12)
  Output index maps:
    out[0] = 0
    out[1] = 0 + 1 * in[0]
    out[2] = 0 + 1 * in[1]
'] [source locations='tensorstore/index_space/internal/propagate_bounds.cc:109
tensorstore/index_space/internal/propagate_bounds.cc:109
tensorstore/index_space/internal/compose_transforms.cc:113
tensorstore/index_space/index_transform.h:106']

In [7]: b[1]
Out[7]:
TensorStore({
  'array': [
    [120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131],
    [132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143],
    [144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155],
    [156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167],
    [168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179],
    [180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191],
    [192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203],
    [204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215],
    [216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227],
    [228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239],
  ],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int64',
  'transform': {'input_exclusive_max': [10, 12], 'input_inclusive_min': [0, 0]},
})
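
For illustration, a small sketch of how the preserved origin shows up and how a tensorstore dimension expression can shift it back to zero-based indexing (this just explores the indexing behaviour; it is not how napari currently handles slicing):

import numpy as np
import tensorstore as ts

arr = ts.array(np.arange(5 * 10 * 12).reshape((5, 10, 12)))
b = arr[1:3]

print(b.domain)                    # dimension 0 keeps its original origin of 1
c = b[ts.d[:].translate_to[0]]     # shift all dimensions back to origin 0
print(c[0].read().result().shape)  # c[0] is now valid, matching NumPy's b[0]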