Closed daxsoule closed 4 years ago
Does it work if you open the URL directly with netCDF4, bypassing xarray? If not, then it’s a netCDF4 or THREDDS problem.
On Jan 1, 2020, at 10:44 PM, Dax Soule notifications@github.com wrote:
Reopened #3653.
This isn't accessing netCDF using opendap, it's directly accessing using HTTP. Did xarray or netCDF gain support for this and I failed to notice?
Thanks for the responses. I think I understand this issue now. I am not an expert, but xarray does not appear to be able to open netCDF from a URL without a THREDDS server. In this case
import xarray as xr
url = 'https://opendap.oceanobservatories.org/thredds/dodsC/ooi/dax.soule@qc.cuny.edu/20191228T052214778Z-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_nano_sample/deployment0001_RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_nano_sample_20191005T000000-20191005T235959.950000.nc'
ds = xr.open_dataset(url)
Works just fine. So I see how to make xarray work using OpenDAP, but I am still unable to open a file directly from object store using the HTTPS URL.
I am not sure that this is supported, but if it is I would love to see an example.
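The difference between the two kinds of URL can be checked before calling xarray. The sketch below is only a heuristic: it assumes the usual THREDDS convention that OPeNDAP endpoints contain /dodsC/ in the path, and the function name guess_access_method is made up for illustration.

```python
from urllib.parse import urlparse


def guess_access_method(url: str) -> str:
    """Guess how a dataset location would be accessed (heuristic only).

    Assumption: THREDDS serves OPeNDAP under a path containing "/dodsC/".
    Anything else reached over http(s) is treated as a plain file download.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return "local-or-other"
    if "/dodsC/" in parsed.path:
        return "opendap"
    return "plain-http"


# An OPeNDAP URL can go straight to xr.open_dataset(url); a plain HTTP URL
# needs one of the workarounds discussed in this thread.
print(guess_access_method("https://www.ldeo.columbia.edu/~rpa/NOAA_NCDC_ERSST_v3b_SST.nc"))
```

A server may of course expose OPeNDAP under a different path, so a "plain-http" answer is a hint rather than proof.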
xarray does not appear to be able to open netCDF from a URL without a THREDDS server
This is not a limitation of xarray per se, but of the netcdf4-python library itself.
You can, however, get around this limitation by using fsspec.
import xarray as xr
from fsspec.implementations.http import HTTPFileSystem
# your Azure URL is not publicly accessible, so I used another one
url = 'https://www.ldeo.columbia.edu/~rpa/NOAA_NCDC_ERSST_v3b_SST.nc'
fs = HTTPFileSystem()
fobj = fs.open(url)
ds = xr.open_dataset(fobj)
ds
This will not be very performant, but it will work.
More concise syntax for the same thing
import xarray as xr
import fsspec
url = 'https://www.ldeo.columbia.edu/~rpa/NOAA_NCDC_ERSST_v3b_SST.nc'
with fsspec.open(url) as fobj:
    ds = xr.open_dataset(fobj)
    print(ds)
Update: there is now a way to read a remote netCDF file from an HTTP server directly using the netcdf4-python library. The trick is to append #mode=bytes to the end of the URL.
import xarray as xr
import netCDF4 # I'm using version 1.5.6
url = "https://www.ldeo.columbia.edu/~rpa/NOAA_NCDC_ERSST_v3b_SST.nc#mode=bytes"
# raw netcdf4 Dataset
ds = netCDF4.Dataset(url)
# xarray Dataset
ds = xr.open_dataset(url)
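If the URLs come from elsewhere, a small helper can add the suffix so it is not forgotten or added twice. This is a sketch, not part of xarray or netCDF4: the names with_bytes_mode and open_remote are hypothetical, and the helper refuses URLs that already carry some other fragment rather than guessing how to combine them.

```python
from urllib.parse import urlsplit


def with_bytes_mode(url: str) -> str:
    """Return url with '#mode=bytes' appended, unless it is already there."""
    fragment = urlsplit(url).fragment
    if "mode=bytes" in fragment:
        return url
    if fragment:
        # Assumption: we do not try to merge with an unrelated fragment.
        raise ValueError(f"URL already has a fragment: {fragment!r}")
    return url + "#mode=bytes"


def open_remote(url: str):
    """Open a plain-HTTP netCDF file with xarray via the #mode=bytes suffix."""
    import xarray as xr  # imported lazily; needs xarray and a recent netCDF4

    return xr.open_dataset(with_bytes_mode(url))


print(with_bytes_mode("https://www.ldeo.columbia.edu/~rpa/NOAA_NCDC_ERSST_v3b_SST.nc"))
```

Calling open_remote(url) then behaves like the xr.open_dataset call above, provided the installed netCDF library supports byte-range access.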
import fsspec
url = 'https://www.ldeo.columbia.edu/~rpa/NOAA_NCDC_ERSST_v3b_SST.nc'
with fsspec.open(url) as fobj:
    ds = xr.open_dataset(fobj)
    print(ds)
If you load the dataset, with:
ds = xr.open_dataset(fobj).load()
that should solve the problem (it did for me)
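Put together, the suggestion might look like the sketch below. The likely reason it helps, stated here as an assumption, is that xarray reads variables lazily, so the data has to be pulled into memory while the remote file object is still open. The function name open_and_load is hypothetical.

```python
def open_and_load(url: str):
    """Read a remote netCDF file fully into memory before the file is closed."""
    # Imported lazily so the function can be defined without the optional
    # dependencies; calling it requires fsspec, xarray and a backend that
    # can read file-like objects (for example h5netcdf or scipy).
    import fsspec
    import xarray as xr

    with fsspec.open(url) as fobj:
        # .load() copies all variables into memory, so the returned dataset
        # no longer depends on fobj once the with block exits.
        return xr.open_dataset(fobj).load()
```

Usage would be ds = open_and_load(url). The whole file is downloaded, so this only suits datasets that fit in memory.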
I am currently trying to read netCDF files directly from a THREDDS server or from Azure blob storage by passing a URL to xarray.open_dataset. I have tested netCDF files from several sources and the files themselves appear to be fine. xarray.open_dataset works if I first download them to my local environment, either with wget.download or by moving the files manually.
MCVE Code Sample
I have also tested it with these additional urls:
Expected Output - In all cases, the expected output is the xarray dataset:
Problem Description
As far as I can tell, neither permissions nor the files are the issue, because the URLs will download the files and xarray will open the files that I download. The only thing I am not able to do is open the files with xarray directly from the server. Any help or suggestions are greatly appreciated.
Output of
xr.open_dataset(url)
cc @tjcrone