Open paolap opened 2 years ago
Great, this definitely looks like something we can add! I have heard some chatter about it in Pangeo circles, but hadn't actually taken the time to look at what it is just yet.
Is this something you want to add @paolap?
I guess if no one has tried it already, I can have a go at using it before writing more about it.
That would be great @paolap - sounds like you're already a step ahead of me in knowing what it is! But happy to help if needed.
https://fsspec.github.io/kerchunk/
kerchunk is an interesting option for cloud-optimised storage of NetCDF, HDF and GRIB data. It seems to work as a virtual aggregation: it creates a single .zarr or .json reference file that points to all the individual files and presents them as a single dataset. I think it might also be indexing the actual chunks. Compared to Zarr on its own, there's no data duplication, since the original files stay where they are. Someone just mentioned it to me this morning and I had a quick look at the documentation.
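To make the "reference file that points at the original files" idea concrete, here is a toy sketch using only the Python standard library. It is not kerchunk's API: `build_references`, `read_chunk`, the `data/<n>` keys and the fixed `chunk_size` are all made up for illustration, and the `[path, offset, length]` layout is only a simplified take on what I understand the reference JSON to look like. The point is that the JSON holds byte ranges, not data, so nothing gets copied.

```python
import json
import os
import tempfile


def build_references(paths, chunk_size):
    """Map dataset-wide chunk keys to [path, offset, length] byte ranges.

    Hypothetical helper: real files would be indexed by their actual
    internal chunk layout, not by fixed-size slices.
    """
    refs = {}
    index = 0
    for path in paths:
        size = os.path.getsize(path)
        for offset in range(0, size, chunk_size):
            length = min(chunk_size, size - offset)
            refs[f"data/{index}"] = [path, offset, length]
            index += 1
    return refs


def read_chunk(refs, key):
    """Fetch one chunk by seeking into the original file (no copy is kept)."""
    path, offset, length = refs[key]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Two stand-in "source files"; in practice these would be NetCDF/HDF/GRIB files.
tmpdir = tempfile.mkdtemp()
paths = []
for name, payload in [("a.bin", b"AAAABBBB"), ("b.bin", b"CCCCDD")]:
    path = os.path.join(tmpdir, name)
    with open(path, "wb") as f:
        f.write(payload)
    paths.append(path)

# Write a single reference file that describes both files as one dataset.
refs = build_references(paths, chunk_size=4)
ref_file = os.path.join(tmpdir, "combined.json")
with open(ref_file, "w") as f:
    json.dump({"version": 1, "refs": refs}, f)

# Later, a reader needs only the reference file to find every chunk.
with open(ref_file) as f:
    loaded = json.load(f)["refs"]
combined = b"".join(read_chunk(loaded, f"data/{i}") for i in range(len(loaded)))
```

The reference file here is a few hundred bytes regardless of how big the source files are, which is my reading of why there's no duplication compared to converting everything to Zarr. How kerchunk actually builds and combines these references is something I'd still need to check against the docs.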