eurec4a / eurec4a-intake

Intake catalogue for EUREC4A field campaign datasets

Excluding further incompatible netcdf4 version #54

Closed observingClouds closed 2 years ago

observingClouds commented 3 years ago

netCDF4 version 1.5.1.2 causes issues, e.g.:

import xarray as xr
xr.open_dataset("https://observations.ipsl.fr/thredds/dodsC/EUREC4A/PRODUCTS/RADIATIVE-PROFILES/rad_profiles.nc")

KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('https://observations.ipsl.fr/thredds/dodsC/EUREC4A/PRODUCTS/RADIATIVE-PROFILES/rad_profiles.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False))]

d70-t commented 3 years ago

I tried the example on Ubuntu 20.04.1 LTS and

# pip3 freeze
cftime==1.4.1
netCDF4==1.5.1.2
numpy==1.20.1
pandas==1.2.2
python-dateutil==2.8.1
pytz==2021.1
six==1.15.0
xarray==0.17.0

as well as macOS 11.1 and

% pip freeze
cftime==1.4.1
netCDF4==1.5.1.2
numpy==1.20.1
pandas==1.2.2
python-dateutil==2.8.1
pytz==2021.1
six==1.15.0
xarray==0.17.0

and could not reproduce the error. Do you know more about why you observed this behavior?

observingClouds commented 3 years ago

Okay, let me check. I get the error with

% pip freeze
cftime==1.4.1
netCDF4==1.5.1.2
numpy==1.19.5
pandas==1.1.5
python-dateutil==2.8.1
pytz==2021.1
six==1.15.0
xarray==0.16.2

and

% python --version
Python 3.6.7

d70-t commented 3 years ago

This is really strange...

I've prepared a Dockerfile + requirements: Archiv.zip. You can run it by extracting it into a folder and executing the following:

docker build -t netcdf_test .
docker run -it --rm netcdf_test

Note that this will create a named Docker image on your machine. To remove it again, run

docker image rm netcdf_test

The executed script contains:

import xarray as xr
print(xr.open_dataset("https://observations.ipsl.fr/thredds/dodsC/EUREC4A/PRODUCTS/RADIATIVE-PROFILES/rad_profiles.nc"))

import os
os.system("python --version")
os.system("pip freeze")

The result is:

<xarray.Dataset>
Dimensions:            (alt: 1000, alt_edges: 1001, launch_time: 2504)
Coordinates:
  * alt                (alt) int32 5 15 25 35 45 55 ... 9955 9965 9975 9985 9995
  * alt_edges          (alt_edges) int32 0 10 20 30 40 ... 9970 9980 9990 10000
  * launch_time        (launch_time) datetime64[ns] 2020-01-19T16:55:14 ... 2...
Data variables:
    platform           (launch_time) |S64 ...
    z_min              (launch_time) float64 ...
    z_max              (launch_time) float64 ...
    sfc_emis           (launch_time) float64 ...
    sfc_alb            (launch_time) float64 ...
    sfc_temperature    (launch_time) float64 ...
    cos_sza            (launch_time) float64 ...
    latitude           (launch_time, alt) float64 ...
    longitude          (launch_time, alt) float64 ...
    temperature        (launch_time, alt) float64 ...
    pressure           (launch_time, alt) float64 ...
    specific_humidity  (launch_time, alt) float64 ...
    pressure_edges     (launch_time, alt_edges) float64 ...
    lw_dn              (launch_time, alt_edges) float64 ...
    lw_up              (launch_time, alt_edges) float64 ...
    lw_net             (launch_time, alt_edges) float64 ...
    sw_dn              (launch_time, alt_edges) float64 ...
    sw_up              (launch_time, alt_edges) float64 ...
    sw_net             (launch_time, alt_edges) float64 ...
    relative_humidity  (launch_time, alt) float64 ...
    wind_speed         (launch_time, alt) float64 ...
    wind_direction     (launch_time, alt) float64 ...
    u_wind             (launch_time, alt) float64 ...
    v_wind             (launch_time, alt) float64 ...
    co2                (launch_time, alt) float64 ...
    ch4                (launch_time, alt) float64 ...
    n2o                (launch_time, alt) float64 ...
    o3                 (launch_time, alt) float64 ...
    o2                 (launch_time, alt) float64 ...
    n2                 (launch_time, alt) float64 ...
    co                 (launch_time, alt) float64 ...
    mr                 (launch_time, alt) float64 ...
    rho                (launch_time, alt) float64 ...
    q_rad              (launch_time, alt) float64 ...
    q_rad_lw           (launch_time, alt) float64 ...
    q_rad_sw           (launch_time, alt) float64 ...
Attributes:
    doi:      https://doi.org/10.25326/78
    history:  Fri Jan  8 16:54:03 2021: ncatted -O -a doi,global,a,c,https://...
    NCO:      netCDF Operators version 4.9.1 (Homepage = http://nco.sf.net, C...
Python 3.6.7
cftime==1.4.1
netCDF4==1.5.1.2
numpy==1.19.5
pandas==1.1.5
python-dateutil==2.8.1
pytz==2021.1
six==1.15.0
xarray==0.16.2

I am hesitant to exclude additional versions if we cannot pin down where this issue comes from. To me, it seems that netCDF4 is not the key problem here and that the cause may lie elsewhere.

Do you have a traceback or anything else that could point us to where the KeyError is raised? Also, why is a list of netCDF datasets seemingly being used as a key into something?
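
One way to collect the requested traceback is sketched below. It is only an illustration, not something posted in this thread: the helper names capture_traceback, open_remote and report are hypothetical, and the sketch assumes xarray is installed in the failing environment. Python's traceback.format_exc() also prints chained exceptions, so if the KeyError is only the first link of a chain, the error that followed it would appear in the same output.

```python
import platform
import sys
import traceback


def capture_traceback(func, *args, **kwargs):
    """Call func(*args, **kwargs); return (result, None) or (None, traceback text)."""
    try:
        return func(*args, **kwargs), None
    except Exception:
        # format_exc() includes chained exceptions ("During handling of the
        # above exception, another exception occurred"), so an error raised
        # while handling a KeyError is reported together with that KeyError.
        return None, traceback.format_exc()


def open_remote(url):
    """Hypothetical helper: open the dataset from the report with xarray."""
    import xarray as xr  # imported lazily; assumes xarray is installed

    return xr.open_dataset(url)


def report(url):
    """Print interpreter details, then the dataset or the full traceback."""
    print("Python", platform.python_version(), "on", sys.platform)
    result, tb = capture_traceback(open_remote, url)
    print(result if tb is None else tb)
```

Calling report() with the URL from the first comment in the failing environment should print either the dataset summary or the complete traceback text, which could then be pasted here.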

RobertPincus commented 2 years ago

@d70-t @observingClouds Do we want to close this PR?