pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

Segmentation fault similar to issue 8410 #9296

Closed forestbat closed 2 weeks ago

forestbat commented 1 month ago

What is your issue?

What happened?

I want to read a netCDF dataset from my local filesystem, but the process crashes with a segmentation fault. The problem occurs only on the local filesystem, not on an S3-compatible file system such as MinIO.

What did you expect to happen?

I want to get data from netCDF file(s).

Minimal Complete Verifiable Example

import xarray as xr

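# obj (str): the file path or URL of the data
obj = "/ftproot/test_dataset.nc"
# Take the text after the last dot as the file extension.
ext_name = obj.split('.')[-1]
# Open only netCDF files; chunks='auto' gives lazily loaded dask arrays (requires dask).
if ext_name in ('nc', 'nc4') or 'nc4' in obj:
    data_obj = xr.open_dataset(obj, chunks='auto')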

MCVE confirmation

√ Minimal example: the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
√ Complete example: the example is self-contained, including all data and the text of any traceback.
√ Verifiable example: the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
√ New issue: a search of GitHub Issues suggests this is not a duplicate.
√ Recent environment: the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

 Fatal Python error: Segmentation fault

Thread 0x00007f4ff8daf700 (most recent call first):
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/concurrent/futures/thread.py", line 81 in _worker
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 982 in run
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1002 in _bootstrap
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/_pydev_bundle/pydev_monkey.py", line 817 in __call__

Thread 0x00007f505a78f700 (most recent call first):
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/selectors.py", line 468 in select
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers-pro/pydevd_asyncio/pydevd_nest_asyncio.py", line 263 in _run_once
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers-pro/pydevd_asyncio/pydevd_nest_asyncio.py", line 218 in run_forever
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 982 in run
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1002 in _bootstrap
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/_pydev_bundle/pydev_monkey.py", line 817 in __call__

Thread 0x00007f5157fff700 (most recent call first):
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 331 in wait
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 629 in wait
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/pydevd.py", line 157 in _on_run
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/_pydevd_bundle/pydevd_comm.py", line 219 in run
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007f515c818700 (most recent call first):
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/_pydevd_bundle/pydevd_comm.py", line 293 in _on_run
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/_pydevd_bundle/pydevd_comm.py", line 219 in run
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1002 in _bootstrap

Thread 0x00007f515d019700 (most recent call first):
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 331 in wait
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/queue.py", line 180 in get
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/_pydevd_bundle/pydevd_comm.py", line 370 in _on_run
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/_pydevd_bundle/pydevd_comm.py", line 219 in run
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/threading.py", line 1002 in _bootstrap

Current thread 0x00007f516fb65740 (most recent call first):
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/xarray/backends/file_manager.py", line 217 in _acquire_with_cache_info
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/xarray/backends/file_manager.py", line 199 in acquire_context
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/contextlib.py", line 137 in __enter__
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 411 in _acquire
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 417 in ds
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 355 in __init__
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 408 in open
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/xarray/backends/netCDF4_.py", line 645 in open_dataset
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/xarray/backends/api.py", line 571 in open_dataset
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/hydrodatasource/reader/access_fs.py", line 92 in read_valid_data
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/hydrodatasource/reader/access_fs.py", line 20 in spec_path
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/hydrodatasource/reader/data_source.py", line 438 in read_MP
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_scalers.py", line 293 in mean_prcp
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_scalers.py", line 414 in get_data_obs
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_scalers.py", line 487 in load_data
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_scalers.py", line 104 in __init__
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_sets.py", line 279 in _normalize
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_sets.py", line 620 in _normalize
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_sets.py", line 261 in _load_data
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_sets.py", line 119 in __init__
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_sets.py", line 613 in __init__
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_sets.py", line 678 in __init__
  File "/home/wangyang1/torchhydro/torchhydro/datasets/data_sets.py", line 762 in __init__
  File "/home/wangyang1/torchhydro/torchhydro/trainers/deep_hydro.py", line 227 in make_dataset
  File "/home/wangyang1/torchhydro/torchhydro/trainers/deep_hydro.py", line 152 in __init__
  File "/home/wangyang1/torchhydro/torchhydro/trainers/deep_hydro.py", line 782 in __init__
  File "/home/wangyang1/torchhydro/torchhydro/trainers/trainer.py", line 80 in _get_deep_hydro
  File "/home/wangyang1/torchhydro/torchhydro/trainers/trainer.py", line 63 in train_and_evaluate
  File "/home/wangyang1/torchhydro/experiments/train_with_era5land_gnn.py", line 54 in test_run_model
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/python.py", line 159 in pytest_pyfunc_call
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/python.py", line 1627 in runtest
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/runner.py", line 174 in pytest_runtest_call
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/runner.py", line 242 in <lambda>
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/runner.py", line 341 in from_call
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/runner.py", line 241 in call_and_report
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/runner.py", line 132 in runtestprotocol
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/runner.py", line 113 in pytest_runtest_protocol
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/main.py", line 362 in pytest_runtestloop
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/main.py", line 337 in _main
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/main.py", line 283 in wrap_session
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/main.py", line 330 in pytest_cmdline_main
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/home/jiaxuwu/.conda/envs/forest-minio/lib/python3.11/site-packages/_pytest/config/__init__.py", line 175 in main
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pycharm/_jb_pytest_runner.py", line 75 in <module>
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18 in execfile
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/pydevd.py", line 1546 in _exec
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/pydevd.py", line 1539 in run
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/pydevd.py", line 2229 in main
  File "/home/wangyang1/.local/share/JetBrains/IntelliJIdea2024.1/python/helpers/pydev/pydevd.py", line 2247 in <module>

Process finished with exit code 139 (interrupted by signal 11:SIGSEGV)
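
The exit code 139 in the last line of the log follows the common shell convention of reporting 128 plus the signal number for a process killed by a signal (139 = 128 + 11, and 11 is SIGSEGV). As a minimal sketch, a small helper can decode such codes; the function name `describe_exit_code` is hypothetical and not part of xarray or any library mentioned here.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Map a shell-style exit code to a signal name when it encodes one."""
    # Shells conventionally report 128 + N for a process killed by signal N.
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            # Not a signal number known on this platform.
            pass
    return f"exit code {code}"


segv_description = describe_exit_code(128 + signal.SIGSEGV)
```

Note that Python's own `subprocess` module reports a signal-terminated child on POSIX as a negative return code (for example `-11`) rather than as 139.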

Anything else we need to know?

I think it's similar to #8410. However, the problem still occurs even after I downgraded netCDF4 to 1.6.0.

Environment

python 3.11.9
xarray 2024.6.0
netcdf4 1.6.0

headtr1ck commented 1 month ago

We do not have access to your test dataset. Can you create this nc file from scratch in your example or make this file available to us?
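
One way to make an example self-contained is to write a small synthetic file first and then read it back. The following is only a sketch under stated assumptions: the variable name `precip`, the dimension sizes, and the helper `make_test_file` are invented for illustration and say nothing about the contents of the reporter's file. `chunks='auto'` is left out so that dask is not required; add it back to mirror the original call.

```python
import os
import tempfile

import numpy as np
import xarray as xr


def make_test_file(path):
    """Write a tiny synthetic dataset to `path` (hypothetical contents)."""
    rng = np.random.default_rng(0)
    ds = xr.Dataset(
        {"precip": (("time", "lat", "lon"), rng.random((4, 3, 2)))},
        coords={
            "time": np.arange(4),
            "lat": [10.0, 20.0, 30.0],
            "lon": [100.0, 110.0],
        },
    )
    ds.to_netcdf(path)
    return ds


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test_dataset.nc")
    expected = make_test_file(path)
    # Read everything into memory before the temporary file is removed.
    with xr.open_dataset(path) as opened:
        loaded = opened.load()
```

If a crash reproduces with a file generated this way, the data itself is unlikely to be the cause; if it does not, the original file or the surrounding environment becomes the more likely suspect.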

forestbat commented 1 month ago

> We do not have access to your test dataset. Can you create this nc file from scratch in your example or make this file available to us?

I have put my dataset here: https://drive.google.com/file/d/151TOGBDy8KOBqQl7uzUq8r8420dRc4jU/view?usp=sharing

kmuehlbauer commented 1 month ago

@forestbat The MCVE works perfectly fine with xarray 2024.6 and netcdf4 1.7.1 on my machine. Judging from your traceback, you are running this from within PyCharm. Does the same crash happen if you run it with a plain Python interpreter?
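
As a rough sketch of that check, the snippet can be launched in a fresh interpreter with no IDE debugger attached, with `faulthandler` enabled through the standard `-X faulthandler` option so that a crash still prints a Python-level traceback. The helper name `run_plain` is hypothetical, and the code string passed to it here is a harmless stand-in for the real MCVE.

```python
import subprocess
import sys


def run_plain(code: str) -> subprocess.CompletedProcess:
    """Run `code` in a new interpreter with faulthandler enabled."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


# Stand-in payload: confirm that faulthandler is active in the child process.
result = run_plain("import faulthandler; print(faulthandler.is_enabled())")
```

On POSIX systems a negative `result.returncode` means the child was killed by that signal number, and `result.stderr` would then hold the faulthandler dump.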

max-sixty commented 2 weeks ago

As part of keeping our issue count below 1000, I'm closing this as it has no MCVE. Please reopen if anyone disagrees.