csiro-coasts / emsarray

xarray extension that supports EMS model formats
BSD 3-Clause "New" or "Revised" License

Investigate segfault when using latest dependencies from PyPI #139

Open mx-moth opened 4 months ago

mx-moth commented 4 months ago

When all dependencies are updated to their latest versions and installed from PyPI (#137), the test suite regularly, but non-deterministically, segfaults. When the latest dependencies are all installed via conda, no segfault has been observed.

As a temporary workaround, disabling dask multithreading seems to stop the segfaults. This is not an acceptable long-term solution, but it will suffice to unblock other development work.

This ticket tracks the investigation so far.


To stop the segfaults, dask can be set to single threaded mode by running:

import dask
dask.config.set(scheduler='synchronous')

This is now enabled by default for test runs. To trigger the failures again, run the tests with pytest --dask-scheduler=threads
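
For readers unfamiliar with how such a flag can be wired up, here is a minimal sketch of a `conftest.py` that provides a `--dask-scheduler` option. This is an illustration only: the list of choices, the help text, and the use of `pytest_configure` are assumptions, and the actual implementation added in #137 may differ.

```python
# conftest.py (hypothetical sketch; emsarray's real conftest may differ)

# Assumed set of choices; dask accepts other scheduler names as well.
SCHEDULERS = ["synchronous", "threads", "processes"]


def pytest_addoption(parser):
    """Register --dask-scheduler, defaulting to the single-threaded scheduler."""
    parser.addoption(
        "--dask-scheduler",
        action="store",
        default="synchronous",
        choices=SCHEDULERS,
        help="dask scheduler to use while running the tests",
    )


def pytest_configure(config):
    """Apply the chosen scheduler globally before any tests run."""
    import dask  # imported lazily so the option can be registered without dask

    dask.config.set(scheduler=config.getoption("--dask-scheduler"))
```

With something like this in place, a plain `pytest` run uses the synchronous scheduler, and `pytest --dask-scheduler threads` opts back in to the threaded scheduler.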


To set up a test environment, clone this repository, create a conda environment, and install the dependencies from PyPI as follows:

$ git clone https://github.com/csiro-coasts/emsarray.git
$ cd emsarray
$ conda env create --name emsarray-tests --no-default-packages --file ./continuous-integration/environment.yaml
$ conda activate emsarray-tests
$ conda install python=3.12 pip
$ pip install -e .[testing]
$ pytest -vv --dask-scheduler threads

The segfaults occur regularly in two specific tests which subset UGrid datasets; however, other subsetting tests have also failed. Python 3.10, 3.11, and 3.12 all exhibit this issue. These tests previously worked fine. The stack traces printed vary, but some examples follow:

tests/conventions/test_ugrid.py::test_make_and_apply_clip_mask
```
$ pytest -vv --dask-scheduler threads
============ test session starts ============
platform linux -- Python 3.11.9, pytest-8.2.2, pluggy-1.5.0 -- /home/hea211/projects/emsarray/.conda/bin/python3.11
cachedir: .pytest_cache
Matplotlib: 3.9.0
Freetype: 2.6.1
rootdir: /home/hea211/projects/emsarray
configfile: pyproject.toml
testpaths: tests
plugins: mpl-0.17.0, cov-5.0.0
collected 365 items
...
tests/conventions/test_ugrid.py::test_make_and_apply_clip_mask Fatal Python error: Segmentation fault

Thread 0x00007f8c195fa700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 113 in _getitem
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 1014 in explicit_indexing_adapter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 100 in __getitem__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 650 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 787 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 576 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 573 in __array__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 118 in getter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 225 in execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in batch_execute_tasks
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 58 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 83 in _worker
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 982 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007f8c19dfb700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 113 in _getitem
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 1014 in explicit_indexing_adapter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 100 in __getitem__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 650 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 787 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 576 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 573 in __array__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 118 in getter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 225 in execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in batch_execute_tasks
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 58 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 83 in _worker
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 982 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007f8c1a5fc700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 113 in _getitem
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 1014 in explicit_indexing_adapter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 100 in __getitem__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 650 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 787 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 576 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 573 in __array__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 118 in getter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 225 in execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in batch_execute_tasks
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 58 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 83 in _worker
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 982 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007f8c1adfd700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 113 in _getitem
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 1014 in explicit_indexing_adapter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 100 in __getitem__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 650 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/coding/variables.py", line 81 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 657 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 787 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 576 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 573 in __array__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 118 in getter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 157 in get
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/optimization.py", line 1001 in __call__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 225 in execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in batch_execute_tasks
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 58 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 83 in _worker
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 982 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007f8c1b7fe700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 113 in _getitem
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 1014 in explicit_indexing_adapter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 100 in __getitem__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 650 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 787 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 576 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 573 in __array__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 118 in getter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 225 in execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in batch_execute_tasks
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 58 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 83 in _worker
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 982 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007f8c1bfff700 (most recent call first):
  File "", line 330 in _handle_fromlist
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py", line 1176 in _sub_datetimelike
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/pandas/core/arrays/datetimelike.py"
Segmentation fault (core dumped)
```

tests/conventions/test_ugrid.py::test_make_and_apply_clip_mask
```
$ pytest -vv --dask-scheduler threads
============ test session starts ============
platform linux -- Python 3.11.9, pytest-8.2.2, pluggy-1.5.0 -- /home/hea211/projects/emsarray/.conda/bin/python3.11
cachedir: .pytest_cache
Matplotlib: 3.9.0
Freetype: 2.6.1
rootdir: /home/hea211/projects/emsarray
configfile: pyproject.toml
testpaths: tests
plugins: mpl-0.17.0, cov-5.0.0
collected 365 items
...
tests/conventions/test_ugrid.py::test_make_and_apply_clip_mask Fatal Python error: Segmentation fault

Thread 0x00007efd0f7fe700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/locks.py", line 64 in __enter__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/locks.py", line 231 in __enter__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 77 in __setitem__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 4380 in load_store_chunk
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 4398 in store_chunk
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 225 in execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in batch_execute_tasks
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 58 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 83 in _worker
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 982 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007efd0ffff700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 81 in _worker
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 982 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007efd2498a700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/locks.py", line 64 in __enter__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/locks.py", line 231 in __enter__
  File Segmentation fault (core dumped)
```

tests/cli/commands/test_clip.py::test_clip
```
$ pytest -vv --dask-scheduler threads -- tests/cli/commands/test_clip.py::test_clip
============ test session starts ============
platform linux -- Python 3.11.9, pytest-8.2.2, pluggy-1.5.0 -- /home/hea211/projects/emsarray/.conda/bin/python3.11
cachedir: .pytest_cache
Matplotlib: 3.9.0
Freetype: 2.6.1
rootdir: /home/hea211/projects/emsarray
configfile: pyproject.toml
plugins: mpl-0.17.0, cov-5.0.0
collected 1 item

tests/cli/commands/test_clip.py::test_clip Fatal Python error: Fatal Python error: Segmentation faultSegmentation fault

Current thread 0x00007fb280fa0700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 113 in _getitem
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 1014 in explicit_indexing_adapter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 100 in __getitem__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 650 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 787 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 576 in get_duck_array
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/indexing.py", line 573 in __array__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 118 in getter
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 225 in execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in batch_execute_tasks
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 58 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 83 in _worker
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 982 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007fb2818a1700 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 79 in __setitem__
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 4380 in load_store_chunk
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", line 4398 in store_chunk
  File Extension modules: "markupsafe._speedups/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/core.py", line 127 in _execute_task
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 225 in execute_task
  File , "numpy._core._multiarray_umath/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239 in
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.py", line 239, in numpy._core._multiarray_testsbatch_execute_tasks
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 58 in run
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py, "numpy.linalg._umath_linalg, line 83 in _worker
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 982, in shapely.librun
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", , line shapely._geos1045 in _bootstrap_inner
  File ", /home/hea211/projects/emsarray/.conda/lib/python3.11/threading.pyshapely._geometry_helpers", line 1002 in _bootstrap

Thread 0x00007fb2a9498740 (most recent call first):
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", , line yaml._yaml327 in wait
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py", line 629 in wait
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/threading.py, "psutil._psutil_linux, line 969 in start, psutil._psutil_posix
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", line 199 in _adjust_thread_count, pyarrow.lib
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/concurrent/futures/thread.py", , line numpy.random._common176 in submit
  File ", /home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.pynumpy.random.bit_generator", line 495 in , fire_tasksnumpy.random._bounded_integers
  File ", /home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/local.pynumpy.random._mt19937", line , 500numpy.random.mtrand in get_async
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/threaded.py", , line numpy.random._philox90 in get ,
  File numpy.random._pcg64"/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/base.py", line 403 in , compute_as_if_collectionnumpy.random._sfc64
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/dask/array/core.py", numpy.random._generator, line 1229 in , storepandas._libs.tslibs.ccalendar
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/namedarray/daskmanager.py", line , 249pandas._libs.tslibs.np_datetime in store
  File ", /home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/common.pypandas._libs.tslibs.dtypes", line 267 in sync ,
  File pandas._libs.tslibs.base"/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/backends/api.py", line , 1346pandas._libs.tslibs.nattype in to_netcdf
  File "/home/hea211/projects/emsarray/.conda/lib/python3.11/site-packages/xarray/core/dataset.py", line 2327 in , to_netcdfpandas._libs.tslibs.timezones
  File "/home/hea211/projects/emsarray/src/emsarray/utils.py", line , 149pandas._libs.tslibs.fields in to_netcdf_with_fixes
  File "/home/hea211/projects/emsarray/src/emsarray/conventions/_base.py, "pandas._libs.tslibs.timedeltas, line 1695 in to_netcdf, pandas._libs.tslibs.tzconversion
  File "/home/hea211/projects/emsarray/src/emsarray/cli/commands/clip.py", , line pandas._libs.tslibs.timestamps53 in handle
  File "/home/hea211/projects/emsarray/src/emsarray/cli/__init__.py", , line pandas._libs.properties, pandas._libs.tslibs.offsetsSegmentation fault (core dumped)
```

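The per-thread traces in these examples are in the format produced by Python's standard `faulthandler` module, which pytest turns on for test runs. To capture the same kind of trace from a standalone reproduction script outside pytest, something like the following minimal sketch could be used. The helper name `enable_crash_traces` is made up for illustration.

```python
import faulthandler
import sys


def enable_crash_traces(file=sys.stderr):
    """Dump the Python stack of every thread if the process hits a fatal signal.

    `file` must be a real file object with a file descriptor, because the
    handler writes to it directly when SIGSEGV (or a similar signal) arrives.
    """
    faulthandler.enable(file=file, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    enable_crash_traces()
    # ... run the code suspected of segfaulting here ...
```

The same behaviour is available without code changes by running `python -X faulthandler script.py` or by setting the `PYTHONFAULTHANDLER=1` environment variable.
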
sharon-tickell commented 4 months ago

In the test_clip example in your post, @mx-moth, it looks like the netCDF4 Python library is definitely in the mix. That uses the netcdf-c and HDF5 C libraries under the hood, and those can optionally be compiled with support for parallel processing via Open MPI. It might be worth checking that your netCDF4 Python library links against the latest versions of those libraries, and whether compiling them with --parallel support makes any difference. (The ereefs/netcdf-base Docker image is set up to let you select library versions and compilation options, and may help provide a test environment.)
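
One way to see which C libraries a given netCDF4 install is linked against is to read the version attributes the module exposes. The sketch below is only a suggestion: the helper name `describe_netcdf_build` and the labels are made up, and `getattr` with a fallback is used in case an attribute is missing in a particular release.

```python
def describe_netcdf_build(module):
    """Collect version and build information from a netCDF4-like module."""
    fields = {
        "netCDF4 (Python)": "__version__",
        "netcdf-c": "__netcdf4libversion__",
        "HDF5": "__hdf5libversion__",
        "parallel HDF5 support": "__has_parallel4_support__",
    }
    # Fall back to "unknown" rather than failing on a missing attribute.
    return {label: getattr(module, attr, "unknown") for label, attr in fields.items()}


if __name__ == "__main__":
    try:
        import netCDF4
    except ImportError:
        print("netCDF4 is not installed in this environment")
    else:
        for label, value in describe_netcdf_build(netCDF4).items():
            print(f"{label}: {value}")
```

Running this in both the PyPI-based and the conda-based environments would show whether the two installs bundle different netcdf-c or HDF5 versions.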

mx-moth commented 4 months ago

Further evidence that this is an issue with the netCDF4 package on PyPI: the errors go away if I downgrade to netCDF4 < 1.7. Unfortunately netCDF4 ~= 1.6.x is not compatible with numpy 2, so numpy also needs downgrading. Setting up the environment as follows does not segfault:

$ conda env create --name emsarray-tests --no-default-packages --file ./continuous-integration/environment.yaml
$ conda activate emsarray-tests
$ conda install python=3.12 pip
$ pip install -e .[testing] 'netcdf4<1.7' 'numpy<2'
$ pytest -vv
sharon-tickell commented 4 months ago

@mx-moth as a point of interest, I just had a go at reproducing this on the ereefs/netcdf-base image using NetCDF libraries compiled with OpenMPI support and could NOT reproduce it!

This environment has:

Preparation steps:

docker pull onaci/ereefs-netcdf-base:python-3.11-slim-bookworm
docker run --rm -i -t onaci/ereefs-netcdf-base:python-3.11-slim-bookworm bash

Then from a shell inside the container:

git clone https://github.com/csiro-coasts/emsarray.git
cd emsarray
git checkout dependency-version-bump

# Ensure the python netcdf4 library compiles its own wheel against the netcdf-c library
# version which is already installed into the base image:
# Note: before running this step, I needed to edit continuous-integration/requirements-3.11.txt
# so that any requirement with extras (like coverage[toml]==7.5.4) no longer had the [] part:
# This is because of https://github.com/pypa/pip/issues/8210 and the newest version of pip!
pip3-netcdf-install continuous-integration/requirements-3.11.txt

# Then edit the requirements-3.11.txt file again to put the extras back...
# And install all the other requirements:
pip3 install -r continuous-integration/requirements-3.11.txt
pip install -e .[testing]

# Run the tests
pytest -vv

All the tests passed without error or segfault.

mx-moth commented 4 months ago

Installing netCDF4 1.7.1 and numpy 2.0.0 from conda, and the rest of the dependencies from pip, also does not segfault. It is looking more likely that the cause is something in the netCDF4 1.7.1 wheel from PyPI.

$ conda env create --name emsarray-tests --no-default-packages --file ./continuous-integration/environment.yaml
$ conda activate emsarray-tests
$ conda install python=3.12 'netcdf4=1.7.1' 'numpy=2.0'
$ pip install -e .[testing]
$ pytest -vv
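
When mixing conda and pip like this, it is easy to lose track of which installer actually provided a package. The following sketch uses only the standard library to report the version and recorded installer of a distribution. The helper name `install_source` is made up, and it relies on the `INSTALLER` metadata file: pip records `pip` there and conda packages usually record `conda`, but the file is not guaranteed to be present.

```python
from importlib import metadata


def install_source(package):
    """Return (version, installer) for a package, or None if it is not installed."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    # INSTALLER is optional metadata; read_text returns None when it is absent.
    installer = (dist.read_text("INSTALLER") or "unknown").strip()
    return dist.version, installer


if __name__ == "__main__":
    for name in ("netCDF4", "numpy"):
        print(name, install_source(name))
```
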
mx-moth commented 4 months ago

I disabled dask multithreading in the tests in #137. To re-enable multithreading when running the tests, run pytest --dask-scheduler threads