sjordan29 opened this issue 5 months ago
Hi @sjordan29 ,
I actually just ran into that same `IndexError` today while trying to work on https://github.com/xpublish-community/xpublish/discussions/246. From the structure of your dataset, I'm guessing it's also an error with the coordinate attributes not being translated correctly into an OpenDAP representation, and Xarray getting a little cranky when trying to open it.
Any chance you could open the original Zarr source locally, slice it down to a minimal dataset that still causes the error, and share that as a NetCDF?
That way I can throw it into the test suite and compare what Xarray opens directly vs what it can load over OpenDAP.
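If it helps, a minimal sketch of what I mean (the Zarr path and the slice bounds below are placeholders; any smallest slice that still reproduces the error works):

```python
import xarray as xr

# open the original Zarr source locally (path is a placeholder)
ds = xr.open_zarr("path/to/PRISM_v2.zarr", decode_times=False)

# keep just a couple of steps along each dimension; shrink until the
# resulting file still reproduces the IndexError
ds_small = ds.isel(time=slice(0, 2), lat=slice(0, 2), lon=slice(0, 2))

ds_small.to_netcdf("PRISM_v2_slice.nc")
```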
Thanks so much! I added a slice of the data and the notebook showing the error I'm getting on our fork here: https://github.com/LimnoTech/xpublish-opendap/tree/index-error/notebooks/dev_sandbox
@abkfenris, thanks for looking into this issue.
BTW, USGS just made their xpublish-builder repo public, which shows how we pull everything together to create the OpenDAP endpoints exposed at https://labs-beta.waterdata.usgs.gov/api/xpublish/catalogs. Our hope is that this builder repo might be useful for you or others working on https://ioos.github.io/ioos-code-sprint/2024/topics/06-xpublish.html
Great, thank you both! I'll try to take a look later or tomorrow.
Hmm, it looks like we're (at least) generating a different `.dds` than what THREDDS generates for this dataset, so I'm going to go back to basics to see how these should be generated.
**THREDDS DDS**

```
Dataset {
    Float32 time_bnds[time = 2][tbnd = 2];
    Float32 time[time = 2];
} datasets/PRISM_v2_slice.nc;
```
**xpublish DDS**

```
Dataset {
    Float32 time[time = 2];
    Float64 tbnd[tbnd = 2];
    Grid {
      Array:
        String time_bnds[time = 2][tbnd = 2];
      Maps:
        Float32 time[time = 2];
        Float64 tbnd[tbnd = 2];
    } time_bnds;
} ds;
```
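(For reference, both `.dds` responses above can be fetched by appending `.dds` to the dataset URL, per the usual OpenDAP convention; the localhost endpoint below is an assumption based on the URLs used later in this thread.)

```python
import requests

# OpenDAP convention: append .dds to the dataset URL
# (localhost endpoint is an assumption; substitute your own server)
resp = requests.get(
    "http://localhost:8000/api/xpublish/PRISM_v2/datasets/zarr-s3-osn/opendap.dds"
)
print(resp.text)
```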
Ok, got a bit farther into the weeds here.
When Hyrax or THREDDS encounters a DataArray that has a non-coordinate dimension, instead of returning a DAP Grid with the dimensions mapped, they return a DAP Array with multiple dimensions. Currently, the underlying OpenDAP library we're using is hard-coded so that Arrays outside of Grids can only have a single dimension.
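For reference, a minimal sketch of the structure that trips this up, with names borrowed from the PRISM dataset: `time_bnds` carries a second dimension, `tbnd`, that has no coordinate variable of its own:

```python
import numpy as np
import xarray as xr

# "time" is a proper coordinate; "tbnd" is a bare dimension with no
# coordinate variable, so a DAP Grid has nothing to map it to
ds = xr.Dataset(
    data_vars={
        "time_bnds": (("time", "tbnd"), np.zeros((2, 2), dtype="float32")),
    },
    coords={"time": ("time", np.array([0.0, 1.0], dtype="float32"))},
)
print(ds)
```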
In #60 I've gone and added some extra encoding checks and a custom OpenDAP Array that encodes dimensions similarly to Hyrax and THREDDS. This seems to take care of the `IndexError`s in some cases.
I've included your PRISM dataset as test data and am running automated tests against it to make sure we don't have a regression. Strangely, it passes with pytest locally and in the GitHub Actions matrix, but fails with `IndexError`s when tested locally with a Nox matrix, so I'm even more confused.
It would be great if you all could try testing that PR and let me know if that does the trick.
Some related issues I found along the way:
@abkfenris, thanks so much for digging into this! I will test #60 with PRISM in our Catalog-To-Xpublish implementation and let you know how it goes.
Just took a look, and the following still gives me an `IndexError`:

```python
import xarray as xr

opendap_url = 'http://localhost:8000/api/xpublish/PRISM_v2/datasets/zarr-s3-osn/opendap'
xr.open_dataset(
    opendap_url,
    engine='netcdf4',
    decode_times=False,
)
```
However, I can query the `time_bnds` variable individually now, which did not work before:

```python
dods_url = 'https://labs-beta.waterdata.usgs.gov/api/xpublish/PRISM_v2/datasets/zarr-s3-osn/opendap?time_bnds'
xr_dataset = xr.open_dataset(
    dods_url,
    decode_times=False,
)
```
I can query every single individual coordinate and variable on its own (and together in a comma-separated list) except for the `tbnd` dimension, and the whole xarray Dataset will lazy-load:

```python
opendap_url = 'http://localhost:8000/api/xpublish/PRISM_v2/datasets/zarr-s3-osn/opendap?crs,ppt,tmn,tmx,time_bnds'
data_array = xr.open_dataset(
    opendap_url,
    engine='netcdf4',
    decode_times=False,
)
```
So this feels really close. It's just the bare dataset URL (`http://localhost:8000/api/xpublish/PRISM_v2/datasets/zarr-s3-osn/opendap`) that still gives an `IndexError`. I do see the changes reflected in the `.dds` and `.das` responses.
Can you try building a new environment? I found that I was sporadically getting those `IndexError`s in testing when Nox resolved a different environment than I had locally or in GitHub Actions on this PR. I compared `pip list` outputs but didn't find any version differences that jumped out at me.
Hi @abkfenris, just got around to building a new environment -- apologies for the delay. Still getting the error. A couple notes:
- We're building Catalog-To-Xpublish in Docker, so that environment is rebuilt every time I build the Docker image from scratch. The environment we're using can be found here: https://code.usgs.gov/wma/nhgf/xpublish/-/tree/xpublish-opendap?ref_type=heads. Attaching the `pip list` from inside that Docker container: docker-pip-list.txt
- Just rebuilt the local environment I use when requesting against the service as well, and still encountered the same behavior as in my most recent comment. Attaching the `pip list` from that local environment too: local-pip-list.txt
Hi @abkfenris, I've been maintaining Catalog-To-Xpublish, and we're having some issues with the `opendap` response for one of the USGS datasets we're working with. Most of the available datasets do work, so this seems to be an anomaly to work out with this particular dataset.

**The Problem Dataset**

We're having issues with `PRISM_v2`; you can view the xarray structure here: https://labs-beta.waterdata.usgs.gov/api/xpublish/PRISM_v2/datasets/zarr-s3-osn/

**The Problem**

When I try to open it with xarray like so:
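A minimal sketch of the call (the endpoint is assumed from the URLs used elsewhere in this thread):

```python
import xarray as xr

# endpoint assumed from the URLs used elsewhere in this thread
opendap_url = 'https://labs-beta.waterdata.usgs.gov/api/xpublish/PRISM_v2/datasets/zarr-s3-osn/opendap'
xr.open_dataset(
    opendap_url,
    engine='netcdf4',
    decode_times=False,
)
```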
I get the following error:
```
IndexError: The indexing operation you are attempting to perform is not valid on netCDF4.Variable object. Try loading your data into memory first by calling .load().
```
**Thoughts**
Seems like this might have something to do with the `tbnd` dimension (which is not a coordinate) and/or its associated variable, `time_bnds`. Note that we don't see this issue when I just request the `ppt` variable, which has dimensions of lat/lon/time.

As I understand it, `xpublish-opendap` reformats xarray datasets via `dap_xarray` into a format that can leverage `opendap-protocol`. I'm wondering if something in these translations between data structures is causing this problem. Do you have any ideas of what might be going on here? Happy to help rig up a fix, but your understanding of these libraries would be super helpful in nailing down the problem.