Closed cgmorton closed 8 months ago
I took a look at the `conus404-hourly-osn` dataset and can confirm that chunks appear to be missing, though I'm not sure how many variables are affected. For comparison I also looked at the `conus404-hourly-onprem` dataset, and it does not have those missing chunks, so I tend to think the zarr dataset transfer to the OSN pod is incomplete. @amsnyder do I have access to write to the pod?
Thanks for checking on this @pnorton-usgs. I just started the copy of the CONUS404 data over to OSN.
@cgmorton - thank you for raising this. I will let you know when the data has been updated.
Yikes! This time, let's make sure we use a transfer method that does checksums (or run checksums afterward). If that's too expensive given how many files these zarr datasets contain, perhaps we could at least confirm that the directory sizes match?
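For what it's worth, here is a minimal sketch of the "checksums after" idea. It assumes both copies of the store can be reached as local or mounted directories, which may not hold for the OSN pod (for object storage one would probably compare object listings and sizes instead). The function names `file_digests` and `compare_trees` are made up for this illustration and are not part of any transfer tool.

```python
import hashlib
import os


def file_digests(root, algo="sha256", block_size=1 << 20):
    """Map each file path (relative to root) to a (size, hex digest) pair."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            h = hashlib.new(algo)
            # Read in blocks so large chunk files are not loaded whole.
            with open(full, "rb") as f:
                for block in iter(lambda: f.read(block_size), b""):
                    h.update(block)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            digests[rel] = (os.path.getsize(full), h.hexdigest())
    return digests


def compare_trees(src, dst):
    """Return (missing_in_dst, mismatched) lists of relative paths."""
    src_d = file_digests(src)
    dst_d = file_digests(dst)
    missing = sorted(set(src_d) - set(dst_d))
    mismatched = sorted(
        k for k in src_d.keys() & dst_d.keys() if src_d[k] != dst_d[k]
    )
    return missing, mismatched
```

Calling `compare_trees(source_dir, copied_dir)` returns the files that never arrived and the files whose size or digest differs. Hashing every chunk is the expensive part, so a cheaper first pass could compare only the set of paths and the sizes.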
Hi @cgmorton - the hourly dataset has been updated in our intake catalog to point to a new copy of the data. It seems to include those missing chunks - can you take a look and let me know how it looks to you?
Sorry for the delay but thank you for updating this! It seems like all of the data is there and I'm not seeing any missing chunks.
Awesome! Thank you for letting us know about the missing data.
I'm noticing numerous chunks of missing/nodata in the U10 variable read from the conus404-hourly-osn zarr store. It seems like it might be an issue just with the zarr store since the missing data seem to match up with the zarr chunks and the same datetimes have complete coverage in the NCAR RDA store (https://thredds.rda.ucar.edu/thredds/catalog/files/g/ds559.0/catalog.html).
Here is a simple gist notebook that hopefully shows the issue: https://gist.github.com/cgmorton/e4a9f6a1121a1d295dcfc73cb280a580
It seems like most of the missing data are between 2000 and 2017, but there are a few dates in the early 1980s that have missing chunks. I have not done a thorough review, but I have not seen any other missing chunks in my quick check of the other variables we are reading (T2, TD2, V10, PSFC, ACSWDNB, PREC_ACC_NN).
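In case it helps with a more thorough review, here is a rough sketch of how the missing chunks could be enumerated without reading any data. It assumes the Zarr v2 layout, where each chunk is stored under a key made of its grid indices joined by a separator (`.` by default) and a chunk that is absent from the store reads back as the fill value. The helper names, the shape, and the chunk sizes below are hypothetical; the real values would come from the array's `.zarray` metadata, and the set of present keys from a listing of the store.

```python
import itertools
import math


def expected_chunk_keys(shape, chunks, separator="."):
    """List every chunk key a fully populated Zarr v2 array would have."""
    # Number of chunks along each dimension (the last chunk may be partial).
    counts = [math.ceil(s / c) for s, c in zip(shape, chunks)]
    grid = itertools.product(*(range(n) for n in counts))
    return [separator.join(str(i) for i in idx) for idx in grid]


def missing_chunk_keys(shape, chunks, present_keys, separator="."):
    """Return expected chunk keys that are absent from present_keys."""
    present = set(present_keys)
    return [
        key
        for key in expected_chunk_keys(shape, chunks, separator)
        if key not in present
    ]


# Hypothetical 2-D example: a 5 x 4 array stored in 2 x 2 chunks.
listing = {".zarray", "0.0", "0.1", "1.1", "2.0", "2.1"}
print(missing_chunk_keys((5, 4), (2, 2), listing))
```

For the toy listing above this prints `['1.0']`. An absent key is not always an error, since a writer may skip chunks that contain only fill values, so the result is best read as a list of candidates to check against the source data.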
Sorry in advance if this is a known issue or if I made some obvious/simple mistake!
Edit: Here is one of the images from the notebook link above showing the missing chunks:

![download](https://github.com/hytest-org/hytest/assets/9002566/99ba64c0-f85a-4cc6-980b-ec917f3fb0c2)