Closed tlambert03 closed 2 years ago
Thanks! I will have a look first thing in the morning (my time).
On Thu, 11 Nov 2021, 19:09 Talley Lambert, @.***> wrote:
this includes #26 https://github.com/tlambert03/nd2/pull/26 ... and modifies the get/set state accordingly to deal with the new lock object.
@VolkerH https://github.com/VolkerH ... after #26 https://github.com/tlambert03/nd2/pull/26, have a look here
You can view, comment on, or merge this pull request online at:
https://github.com/tlambert03/nd2/pull/27 Commit Summary
- new array subclass https://github.com/tlambert03/nd2/pull/27/commits/754ab1f7fb05c93b0b5c15f9275fc90f71d93aad
- Merge branch 'main' into fix-dask-dispatch https://github.com/tlambert03/nd2/pull/27/commits/44bc35ac42f09f41b396b6eb3fca1bd0d9d833e8
- pickle test https://github.com/tlambert03/nd2/pull/27/commits/cab55e227f8ad7d98951c2c2a9802ed15197a6ef
File Changes
(8 files https://github.com/tlambert03/nd2/pull/27/files)
- M setup.cfg https://github.com/tlambert03/nd2/pull/27/files#diff-fa602a8a75dc9dcc92261bac5f533c2a85e34fcceaff63b3a3a81d9acde2fc52 (2)
- D src/nd2/_dask_proxy.py https://github.com/tlambert03/nd2/pull/27/files#diff-89a4af5cbbcd5d72d4fcbe2b06a4407ae875336e88aed9196345a4c84e5103d6 (89)
- M src/nd2/nd2file.py https://github.com/tlambert03/nd2/pull/27/files#diff-2b4fa881170f0185c9e1dde01a0a029c5dfdf995bb0450f5b6f7a958be721f33 (81)
- A src/nd2/opening_dask_array.py https://github.com/tlambert03/nd2/pull/27/files#diff-546a41a15ebe9288dd758edb3ba6b8cfea7d54e19372cf48e7a915a9d38f8c38 (126)
- M tests/conftest.py https://github.com/tlambert03/nd2/pull/27/files#diff-e52e4ddd58b7ef887ab03c04116e676f6280b824ab7469d5d3080e5cba4f2128 (7)
- A tests/test_dask_dispatch.py https://github.com/tlambert03/nd2/pull/27/files#diff-d4e9d8205f51d354667a389e971d58a2c7d4490fd624f251cbc5954ff45ba035 (58)
- M tests/test_dask_proxy.py https://github.com/tlambert03/nd2/pull/27/files#diff-5bd88b24bdd4a4a01e6fb01eaedb3e8dbee153b1b030c0b931681e16378193de (34)
- M tests/test_reader.py https://github.com/tlambert03/nd2/pull/27/files#diff-88ac3b584a5505d52ad940bf74289789a177ce60ae6852ba4d245b362612afd1 (20)
Patch Links:
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tlambert03/nd2/pull/27, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAQ3R7DEMKGZGHXQWUHCSKDULQBDJANCNFSM5H3BMQGQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
Merging #27 (8cfda51) into main (1e2d45b) will decrease coverage by
0.06%
. The diff coverage is93.93%
.:exclamation: Current head 8cfda51 differs from pull request most recent head 25a0514. Consider uploading reports for the commit 25a0514 to get more accurate results
@@ Coverage Diff @@
## main #27 +/- ##
==========================================
- Coverage 87.29% 87.23% -0.07%
==========================================
Files 11 11
Lines 1149 1159 +10
==========================================
+ Hits 1003 1011 +8
- Misses 146 148 +2
Impacted Files | Coverage Δ | |
---|---|---|
src/nd2/_sdk/latest.pyx | 84.21% <50.00%> (-0.34%) |
:arrow_down: |
src/nd2/resource_backed_array.py | 96.77% <92.30%> (ø) |
|
src/nd2/nd2file.py | 91.89% <100.00%> (+0.06%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 1e2d45b...25a0514. Read the comment docs.
Ooops .... I was too quick. While my tests pass, my actual application still crashes when I return the dask array from a context manager. I will try and debug where that happens and provide a minimal example.
So what I see in the context of a larger application (shortened traceback) when returning the dask array from a ND2File context manager. When I leave ND2File open and return the array, I don't observe the crashes.
When I try and create a smaller example consisting of the following two steps
np.array(da_returned_from_context_manager[0,0,:,:])
it doesn't crash. Although these are exactly the steps that happen in the context of the larger application.
Fatal Python error: Segmentation fault
Thread 0x00007f26dba25700 (most recent call first):
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/concurrent/futures/thread.py", line 75 in _worker
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/threading.py", line 892 in run
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/threading.py", line 954 in _bootstrap_inner
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/threading.py", line 912 in _bootstrap
File "/home/hilsenst/.vscode/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydev_bundle/pydev_monkey.py", line 1054 in __call__
Thread 0x00007f275e7fc700 (most recent call first):
Now I can reproduce it with a small example
my test_nd2.py
import dask.array as da
import numpy as np
from nd2 import ND2File
# TODO: create a small test .nd2 on the Nikon that we can add to resources
# or add something to git lfs
dataset_nd2 = "/home/hilsenst/Documents/Luisa_Reference_HT/PreMaldi/Seq0000.nd2"
# removed a number of tests here for brevity
...
def _load_nd2(filepath):
with ND2File(filepath) as f:
return f.to_dask()
def test_nd2_from_context_mgr():
arr = _load_nd2(dataset_nd2)
np_slice = np.array(arr[0, 0, :, :])
Then runninng the test:
tests/test_nd2.py ....Fatal Python error: Segmentation fault
Thread 0x00007f4c0ecff700 (most recent call first):
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/concurrent/futures/thread.py", line 75 in _worker
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/threading.py", line 892 in run
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/threading.py", line 954 in _bootstrap_inner
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/threading.py", line 912 in _bootstrap
Current thread 0x00007f4c45f31180 (most recent call first):
File "/home/hilsenst/GitlabEMBL/spacem-ht/src/spacem-mosaic/tests/test_nd2.py", line 53 in test_nd2_from_context_mgr
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/python.py", line 183 in pytest_pyfunc_call
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/callers.py", line 187 in _multicall
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 84 in <lambda>
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/hooks.py", line 286 in __call__
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/python.py", line 1641 in runtest
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/runner.py", line 162 in pytest_runtest_call
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/callers.py", line 187 in _multicall
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 84 in <lambda>
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/hooks.py", line 286 in __call__
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/runner.py", line 255 in <lambda>
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/runner.py", line 311 in from_call
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/runner.py", line 254 in call_runtest_hook
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/runner.py", line 215 in call_and_report
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/runner.py", line 126 in runtestprotocol
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/runner.py", line 109 in pytest_runtest_protocol
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/callers.py", line 187 in _multicall
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 84 in <lambda>
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/hooks.py", line 286 in __call__
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/main.py", line 348 in pytest_runtestloop
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/callers.py", line 187 in _multicall
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 84 in <lambda>
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/hooks.py", line 286 in __call__
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/main.py", line 323 in _main
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/main.py", line 269 in wrap_session
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/main.py", line 316 in pytest_cmdline_main
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/callers.py", line 187 in _multicall
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 84 in <lambda>
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/pluggy/hooks.py", line 286 in __call__
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/config/__init__.py", line 162 in main
File "/home/hilsenst/miniconda3/envs/napari_latest/lib/python3.9/site-packages/_pytest/config/__init__.py", line 185 in console_main
File "/home/hilsenst/miniconda3/envs/napari_latest/bin/pytest", line 8 in <module>
Segmentation fault (core dumped)
thank you @VolkerH! I really appreciate the help. I think we're narrowing in on something.
I will note that I found a few edge cases where I had to specify the scheduler to avoid a segfault (but I usually was able to figure out why), so let me try your example here.
not that this is a nice solution, but perhaps try the following?
def test_nd2_from_context_mgr():
arr = _load_nd2(dataset_nd2)
np_slice = np.array(arr[0, 0, :, :].compute(scheduler='threads'))
I'm inclined to merge what we have so far, and keep working at the edge cases... that ok with you?
Now I can reproduce it with a small example
my
test_nd2.py
hmmm, this test actually passes fine for (on mac), both in pytest, as well as in ipython or as a script. you're on ubuntu yes?
Yes, Ubuntu 20.04. I initially had some trouble reproducing this error with very similar code. I'm not sure how much this is effected by some sort of external state (number of cores being used, race conditions between threads, etc)
On Fri, 12 Nov 2021 at 14:55, Talley Lambert @.***> wrote:
Now I can reproduce it with a small example
my test_nd2.py
hmmm, this test actually passes fine for (on mac). you're on ubuntu yes?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tlambert03/nd2/pull/27#issuecomment-967135563, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAQ3R7BW3HLUBDWFMJVSEB3ULUMEPANCNFSM5H3BMQGQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
I'm inclined to merge what we have so far, and keep working at the edge cases... that ok with you?
Yes, I'm all for it. I'm working with this version already and just work around the context manager issue by leaving the ND2File object in open state. Happy to leave this for a while, I think getting to the bottom of that threading/segfault problem might be time-consuming (and probably exceeds my skills)
got the leaked file figured out! One of the key details was that the dask array wasn't being "unpickled" back to a the ResourceBackedDaskArray
... (but rather a regular dask array). Overriding the __reduce__
method let me restore it as the ResourceBacked array, wherein I could call setstate and close the underlying file_ctx. learned a thing or two about pickle in the process!
Just saw this. Great that you could track it down. Will test first thing Monday!
this includes #26 ... and modifies the get/set state accordingly to deal with the new lock object.
@VolkerH ... after #26, have a look here