jbusecke / esgf-virtual-zarr-data-access

ESGF working group to enable data access via virtual zarrs.
Apache License 2.0

Hack Wrapup #7

Open jbusecke opened 1 month ago

jbusecke commented 1 month ago

Great day hacking with @sashakames on using VirtualiZarr to produce reference files on ESGF.

We successfully produced a few reference files, exposed them via HTTP, and were able to access them (and compute on them) in some environments.

Bugs:

Optional:

Plan to wrap up proof of concept:

We should also wait until https://github.com/TomNicholas/VirtualiZarr/pull/126 is fully tested and merged before we produce a lot of references. I believe there is currently a bug in the PR, but it is easy enough to work around.

jbusecke commented 1 month ago

@sashakames I just cleaned up the code a bit. Let's use https://github.com/jbusecke/esgf-virtual-zarr-data-access/blob/main/virtual-zarr-script.py and https://github.com/jbusecke/esgf-virtual-zarr-data-access/blob/main/requirements.txt to produce the next files.

I will add dependencies and code there to parallelize the generation of the virtual datasets and to fix the bugs above.

TomNicholas commented 1 month ago

The metadata is lost in the combination step

Do you mean the metadata stored in the xarray .attrs or something else?

Yes, that is right. In my initial attempt the metadata was lost during the concat step, because xarray's default behavior is to drop all of it. I have now set this option to drop only the conflicting attributes.
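For anyone following along, here is a rough sketch of what "only drop conflicts" means for the attributes. It is a simplified pure-Python illustration, not xarray's actual implementation and not the code in the script. The function name `merge_attrs_drop_conflicts` and the attribute values are made up for the example. In xarray itself, this behavior is selected with `combine_attrs="drop_conflicts"` on functions such as `xr.concat`.

```python
def merge_attrs_drop_conflicts(attrs_list):
    """Merge attribute dicts, keeping only keys whose values never disagree.

    Simplified illustration of the "drop conflicts" idea (hypothetical helper):
    a key that conflicts once is dropped for good, even if later files carry it.
    """
    merged = {}
    dropped = set()
    for attrs in attrs_list:
        for key, value in attrs.items():
            if key in dropped:
                continue  # already known to conflict, never re-add
            if key not in merged:
                merged[key] = value  # first time we see this key
            elif merged[key] != value:
                del merged[key]  # values disagree: drop the key
                dropped.add(key)
    return merged


# Hypothetical global attributes from two files of the same dataset.
file_a = {"institution_id": "EXAMPLE", "frequency": "mon", "tracking_id": "aaa"}
file_b = {"institution_id": "EXAMPLE", "frequency": "mon", "tracking_id": "bbb"}

combined = merge_attrs_drop_conflicts([file_a, file_b])
print(combined)
```

Here the shared attributes survive the combination, and only `tracking_id`, which differs between the two files, is dropped.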