Closed eni-awowale closed 3 months ago
That is absolutely gnarly, and surprising that you can use `xr.open_dataset` just fine, seeing as the current implementation of `open_datatree` just calls `xr.open_dataset(group=...)` repeatedly.
Does this bug still happen if you check out the re-implemented version of `open_datatree` in https://github.com/pydata/xarray/pull/9014?
EDIT: I also wonder if perhaps this could be reproduced by calling `xr.open_dataset` on the same group many times? I'm struggling to see why the datatree component of this would be necessary to reproduce it.
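One way to try this suggestion is a small loop that opens the same group over and over and prints a counter after each successful open, so the last line printed before a crash shows which call failed. This is only a sketch: `open_repeatedly` and `reproduce` are hypothetical helper names, the group name `Sequences` is borrowed from the replies in this thread, and it assumes xarray with the netCDF4 backend is installed.

```python
from typing import Any, Callable


def open_repeatedly(opener: Callable[[], Any], attempts: int = 100) -> int:
    """Call opener() `attempts` times, closing each result.

    Returns the number of opens that completed. A segmentation fault
    kills the process before this returns, so progress is printed and
    flushed after every successful open.
    """
    completed = 0
    for _ in range(attempts):
        handle = opener()
        # Close the handle if the returned object supports it.
        close = getattr(handle, "close", None)
        if callable(close):
            close()
        completed += 1
        print(f"open #{completed} ok", flush=True)
    return completed


def reproduce(path: str, group: str = "Sequences", attempts: int = 100) -> int:
    """Hypothetical driver: reopen one group of `path` with xr.open_dataset."""
    # Imported here so open_repeatedly can be used without xarray installed.
    import xarray as xr

    return open_repeatedly(
        lambda: xr.open_dataset(path, group=group, engine="netcdf4"),
        attempts=attempts,
    )
```

Calling `reproduce("granule.nc")` (the file name is a placeholder) would then exercise only `xr.open_dataset`, without any datatree code in the loop.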
Hey Tom,
I actually just replicated this in our docker image with `netCDF4.Dataset()` by specifying `group='Sequences'`, but this only happened once and I'm unable to replicate again.
The docker image uses the python 3.12 base image. Eni and I both have M1 Macs so we use the `linux/arm64` build of this image. I'm going to try replicating this on the `linux/amd64` base image and see what happens.
On my own machine (M1 Mac) with netcdf-c `4.8.2`, I do indeed replicate the segmentation fault by repeatedly looping both `xr.open_dataset()` and `netCDF4.Dataset()`. But this happens regardless of whether I choose the group with the string-type variables in it (`Sequences`) or another group that doesn't have any (`Science`).
Worth mentioning that we are building netcdf-c from source in our docker image - due to a persistent issue with NASA Earthdata Login requiring a specific version of netcdf-c not in the linux repo for our image. But no special options are given to the configuration of that.
I agree with Eni that this seems to be something to do with netcdf-c - likely some built-in caching that we don't have an interface for. I don't see any of these issues when reading the same dataset with the hdf5 library.
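Since a segmentation fault takes the whole interpreter down, it can help to run each reproduction attempt in a child process and inspect its exit status from the parent. The sketch below is one way to do that using only the standard library; `run_isolated` and `died_from_segfault` are hypothetical names, and the return-code check assumes a POSIX system (Linux or macOS), where a process killed by a signal reports the negated signal number.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 120.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh Python interpreter and capture its output.

    `-X faulthandler` asks the child to dump a Python traceback to
    stderr if it receives a fatal signal such as SIGSEGV.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def died_from_segfault(result: subprocess.CompletedProcess) -> bool:
    """True if the child was terminated by SIGSEGV (POSIX only)."""
    return result.returncode == -signal.SIGSEGV
```

The string passed to `run_isolated` would contain the open-in-a-loop reproduction itself. The parent can then repeat the experiment across engines or groups and read `result.stderr` for the traceback, without being killed along with the child.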
@TomNicholas I was able to replicate this issue with `nc4.open_dataset()`. It failed after the third retry. I will edit the title since this is not isolated to `open_datatree()`.
Okay thanks both - so I will close this as an upstream issue then?
Thanks Tom! Sounds good. I might open the issue directly in the netCDF4 python library.
What happened?
Hi everyone, excited to report my first bug 🐛! I have been creating some grouped test netCDF4 files for unit testing our internal repository. I started getting segmentation faults when I added variables of a string datatype. This only happens with `engine='netCDF4'`. When you change the engine to `'h5netcdf'` there are no segmentation faults. We think this has something to do with the netcdf-c library. However, I have only been able to replicate this issue with `open_datatree()`, with the engine set to the default `netCDF4` library, and not with `nc4.Dataset()` or `xr.open_dataset()`. My colleague @lsterzinger has been getting segmentation faults with all three of these methods and will elaborate on this thread.
We've been able to narrow this down to a problem with data variables with a non-numerical datatype, by creating netCDF4 files with variables of a string datatype, `np.dtype('<U4')`. `open_datatree()` seg faults after the fourth call (see example below). I have not been able to replicate segmentation faults for netCDF4 files without string data variables, even with a thousand calls to `open_datatree()`, or with the engine set to `'h5netcdf'` for datasets with string variables.
To replicate this error:
In a docker container we are running the netcdf-c library version `4.8.1` and we are building the netCDF4 python library from source. On my local machine I am running netcdf-c library version `4.9.3-development`. I have been getting the segmentation faults on both machines.
Data source
Here is the granule data download link from our online archive. It has non-numerical datatypes, specifically string and datetime types.
Granule tree structure:
What did you expect to happen?
I expected `open_datatree(engine='netCDF4')` to return a DataTree object. Instead it seg faults.
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
Anything else we need to know?
No response
Environment