tlambert03 / nd2

Full-featured nd2 (Nikon NIS Elements) file reader for python. Outputs to numpy, dask, and xarray. Exhaustive metadata extraction
https://tlambert03.github.io/nd2
BSD 3-Clause "New" or "Revised" License
53 stars 15 forks source link

Use resource-backed-dask-array, copy dask chunks by default #40

Closed tlambert03 closed 2 years ago

tlambert03 commented 2 years ago

pull out that module into it's own repo https://github.com/tlambert03/resource-backed-dask-array

This also always uses array.copy() in the _dask chunk function. It's conceivable may slow down certain read operations, but there were still some lingering segfaults that I can't figure out. (added a test for one). Better safe than sorry for now. We can revisit performance in the future (or, users can play with to_dask(copy=False))

codecov[bot] commented 2 years ago

Codecov Report

Merging #40 (bf70656) into main (49a09c9) will decrease coverage by 1.18%. The diff coverage is 85.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #40      +/-   ##
==========================================
- Coverage   87.40%   86.21%   -1.19%     
==========================================
  Files          11       10       -1     
  Lines        1175     1110      -65     
==========================================
- Hits         1027      957      -70     
- Misses        148      153       +5     
Impacted Files Coverage Δ
src/nd2/nd2file.py 93.86% <77.77%> (+1.88%) :arrow_up:
src/nd2/_sdk/latest.pyx 76.71% <100.00%> (-6.47%) :arrow_down:
src/nd2/_legacy.py 84.09% <0.00%> (+0.90%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 49a09c9...bf70656. Read the comment docs.

tlambert03 commented 2 years ago

hey, @jacksonmaxfield

Small heads up here. In addition to pulling out that resource-backed array thingy, this PR is largely for improved compatibility with aicsimagio. After a long time playing with it, I can't quite figure out what about that np.test_array_equal bit to causing a segfault, so I'm falling back to the safer behavior of using np.copy on dask chunks by default. This definitely fixes that segfault, but may decrease dask performance a bit. Better safe than sorry, and users can still futz with copy=False if they want.

will release and make PR to aicsimageio

tlambert03 commented 2 years ago

oh, note also that I'm adding a direct aicsimageio test here as well, so we should be safe to pin >= from now on, as this will fail if it fails the aics test

tlambert03 commented 2 years ago

actually @VolkerH as well, I hope that this will eliminate the remaining segfaults you were seeing (at the cost of a little performance, but there's a copy param you can turn off if you want to be daring ;))

VolkerH commented 2 years ago

Thanks for tagging me @tlambert03, I hope to give this a try soon.

VolkerH commented 2 years ago

Hi @tlambert03,

sorry for the late feedback. I just upgraded to the latest packaged version (0.2.1) and the one of my tests that resulted in a segmentation fault with 0.1.6 now passes:

def _load_nd2(filepath):
    with ND2File(filepath) as f:
        return f.to_dask()

@pytest.mark.skipif(RUNNING_IN_CI, reason="Test resource missing")
def test_nd2_from_context_mgr():
    arr = _load_nd2(dataset_nd2)
    np_slice = np.array(arr[0, 0, :, :])

Thanks for fixing this !

tlambert03 commented 2 years ago

thanks for the update!