microsoft / PlanetaryComputerExamples

Examples of using the Planetary Computer
MIT License

multiple reads of nasa-nex-gddp-cmip6 dataset from MultiZarrToZarr concatenated metadata returns all nans #289

Open solomon-negusse opened 9 months ago

solomon-negusse commented 9 months ago

Simply running cells 13 & 14 of the notebook twice reproduces this issue. Those cells read and plot a single-variable time series for a point: the first run returns valid values, but the second returns all NaNs. I encountered this when parallelizing the file reads with dask, which results in multiple reads and this unexpected result.

TomAugspurger commented 9 months ago

Thanks for the report. I won't have a chance to look into this for a while, but one note in case you want to look into it:

It looks like that's mutating a value in-place:

d["templates"][key] = d["templates"][key] + "?" + sas_token

So that cell probably isn't idempotent, and it probably isn't safe to run multiple times: each run would append the token to the template again. I wonder what would happen if you avoid that, either by splitting out the logic that updates the template, or by copy.deepcopy-ing d before running that.
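A minimal sketch of the copy.deepcopy approach, in case it helps. The dictionary contents, the token value, and the helper name `with_token` are all made up for illustration; only the `d["templates"][key]` update comes from the notebook. I haven't confirmed that this resolves the NaNs.

```python
import copy

# Hypothetical stand-ins for the notebook's reference dict and SAS token.
d = {"templates": {"a": "https://example.blob.core.windows.net/container/file.nc"}}
sas_token = "sv=2021&sig=abc"


def with_token(refs, token):
    """Return a signed copy of refs, leaving the original untouched."""
    signed = copy.deepcopy(refs)
    for key, url in signed["templates"].items():
        # Same update as in the notebook, but applied to the copy only.
        signed["templates"][key] = url + "?" + token
    return signed


# Calling the helper repeatedly gives the same result each time,
# because every call starts from the unmodified d.
first = with_token(d, sas_token)
second = with_token(d, sas_token)
```

With the in-place version, a second run would produce a URL ending in `?<token>?<token>`. Here `d` never changes, so rerunning the cell (or calling it from several dask tasks) always starts from the same templates.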

solomon-negusse commented 9 months ago

Thanks for the quick response, will test that out.