CCI-Tools / cate

ESA CCI Toolbox (Cate)
MIT License

Failing to open remote data source #892

Closed AliceBalfanz closed 4 years ago

AliceBalfanz commented 4 years ago

Expected behavior

When I use the Cate Python API to open a dataset from a remote data store, and I know that the same dataset opens once it has been added to the local store, I would expect open_dataset to behave the same for remote and local datasets.

Actual behavior

An error is returned:

DataAccessError: Failed to open data source "esacci2.SST.day.L3C.SSTskin.AVHRR-3.NOAA-19.AVHRR19_G.2-1.r1" for given time range: [Errno -45] NetCDF: Not a valid data type or _FillValue type mismatch: b'http://cci-odp-data2.ceda.ac.uk/thredds/dodsC/esacci/sst/data/CDR_v2/AVHRR/L3C/v2.1/AVHRR19_G/2009/02/23/20090223120000-ESACCI-L3C_GHRSST-SSTskin-AVHRR19_G-CDR2.1_day-v02.0-fv01.0.nc'

Steps to reproduce the problem

  1. Importing all necessary packages

    from cate.core.ds import DATA_STORE_REGISTRY
    from cate.core import ds

  2. Getting opensearch datastore and local datastore

    data_store = DATA_STORE_REGISTRY.get_data_store('esa_cci_odp_os')
    local_store = DATA_STORE_REGISTRY.get_data_store('local')

  3. Selecting desired dataset

    data_source = data_store.query(ds_id='esacci2.SST.day.L3C.SSTskin.AVHRR-3.NOAA-19.AVHRR19_G.2-1.r1')[0]

  4. Opening the remote datasource

    ds_from_remote_source = ds.open_dataset(data_source, time_range=['2009-02-23', '2009-02-24'])

The following error is raised:

DataAccessError: Failed to open data source "esacci.SST.day.L3C.SSTskin.AVHRR-3.NOAA-19.AVHRR19_G.2-1.r1" for given time range: [Errno -45] NetCDF: Not a valid data type or _FillValue type mismatch: b'http://cci-odp-data2.ceda.ac.uk/thredds/dodsC/esacci/sst/data/CDR_v2/AVHRR/L3C/v2.1/AVHRR19_G/2009/02/23/20090223120000-ESACCI-L3C_GHRSST-SSTskin-AVHRR19_G-CDR2.1_day-v02.0-fv01.0.nc'

  5. Making local first, and then opening the dataset returns expected results:

    data_store.query('esacci2.SST.day.L3C.SSTskin.AVHRR-3.NOAA-19.AVHRR19_G.2-1.r1')[0].make_local('test_sst_two_days', time_range=['2009-02-23', '2009-02-24'])
    ds.open_dataset('local.test_sst_two_days')

TonioF commented 4 years ago

This can be worked around by appending #fillmismatch to the OPeNDAP URL. It is not the cleanest solution, but as long as there are no updated releases of libnetcdf or netcdf4, it is the best way to go.
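
For illustration, a minimal sketch of how the #fillmismatch flag could be appended to an OPeNDAP URL before it is handed to the NetCDF library. The helper name with_fillmismatch is hypothetical and not part of Cate, and joining several fragment flags with '&' is an assumption about how the NetCDF library parses them. This is not necessarily how the fix was implemented in Cate.

```python
from urllib.parse import urlsplit, urlunsplit


def with_fillmismatch(url: str) -> str:
    """Return `url` with the 'fillmismatch' flag present in its fragment.

    Hypothetical helper, not part of Cate. Existing fragment flags are
    kept and joined with '&' (an assumption about the expected syntax).
    """
    parts = urlsplit(url)
    # Split any existing fragment into its individual flags.
    flags = [flag for flag in parts.fragment.split('&') if flag]
    # Add the flag only once, so the helper is safe to call repeatedly.
    if 'fillmismatch' not in flags:
        flags.append('fillmismatch')
    return urlunsplit(parts._replace(fragment='&'.join(flags)))


# Placeholder URL for demonstration only.
print(with_fillmismatch('http://example.org/thredds/dodsC/some/file.nc'))
```

The returned URL would then be passed to whatever opens the remote file (for example netCDF4 or xarray), in place of the original one. Calling the helper on a URL that already carries the flag leaves it unchanged.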