carbonplan / maps

interactive multi-dimensional data-driven web maps
https://maps.demo.carbonplan.org

Support for Kerchunk data sources #100

Open zachsa opened 1 year ago

zachsa commented 1 year ago

Hello,

Would it be feasible to support kerchunk-output as a data source?

As far as I understand, Kerchunk is a tool that provides a JSON-formatted breakdown of byte offsets of a NetCDF v4 file. To illustrate, here's an example of a Kerchunk-generated JSON file.
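For readers who have not seen one, here is a minimal sketch of what such a reference file contains and how a client might turn one entry into an HTTP range request. The sample object is hand-written for illustration and is not taken from the file above; the URL, the array shape, and the `resolveReference` helper are all hypothetical. It assumes the Kerchunk version 1 layout, where each key maps either to an inline string or to a `[url, offset, length]` array.

```javascript
// Hypothetical, hand-written reference set in the Kerchunk version 1 layout.
// The URL, shapes and byte offsets below are made up for illustration.
const references = {
  version: 1,
  refs: {
    // Zarr metadata is stored inline as JSON strings.
    '.zgroup': '{"zarr_format":2}',
    'temperature/.zarray':
      '{"shape":[10,20,30,240],"chunks":[10,20,30,1],"dtype":"<f4"}',
    // Chunk keys point into the original file: [url, byte offset, byte length].
    'temperature/0.0.0.0': ['https://example.com/data.nc', 8192, 24000],
  },
}

// Hypothetical helper: describe how a client would obtain the bytes for a key.
function resolveReference(refs, key) {
  const entry = refs[key]
  if (entry === undefined) return null
  // Inline entries carry their content directly in the JSON.
  if (typeof entry === 'string') return { kind: 'inline', data: entry }
  const [url, offset, length] = entry
  // A reference without offset/length means "the whole file".
  if (offset === undefined) return { kind: 'remote', url, headers: {} }
  // HTTP Range headers use inclusive end positions.
  const end = offset + length - 1
  return { kind: 'remote', url, headers: { Range: `bytes=${offset}-${end}` } }
}

const chunk = resolveReference(references.refs, 'temperature/0.0.0.0')
console.log(chunk.url, chunk.headers.Range)
```

The resulting `headers` object could be passed to `fetch(chunk.url, { headers: chunk.headers })`, provided the server hosting the NetCDF file supports range requests and CORS.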

Though I'm not very familiar with the technical aspects of Zarr directories or Kerchunk, I can see some similarities between the two. But from a client/JavaScript perspective, it doesn't look like the Kerchunk JSON would be equivalent to a Zarr directory if I were to point to it directly, for example:

<Map>
  <Raster
    colormap={colormap}
    clim={[-20, 30]}
    source="https://mnemosyne.somisana.ac.za/somisana/algoa-bay/5-day-forecast/202307/20230712-hourly-avg-t3.kerchunk.json"
    variable={'temperature'}
    dimensions={['depth', 'y', 'x', 'time']}
    selector={{ depth: 200, time: 120 }}
  />
</Map>

Please let me know if this is already supported, or if not, whether this would be a simple/complex task. The benefit of supporting Kerchunk output rather than Zarrs directly is that it would save us around 1TB of space per year (assuming Zarrs are of a similar size to NetCDF v4 files) as we also need to store NetCDF files.

katamartin commented 1 year ago

Thanks for the question @zachsa and apologies for the delay!

Briefly, this is not currently possible, and extending @carbonplan/maps to support NetCDF files seems pretty challenging.

This is because (1) @carbonplan/maps requires that the data be prepared in multiscales format, and (2) performant data fetching and rendering in the browser requires access to relatively small chunks (in the ballpark of <10MB). It's possible to loosen (1) with some work, but (2) seems pretty insurmountable. However, we have been interested in exploring whether Kerchunk could allow us to visualize Cloud Optimized GeoTIFFs, whose pyramids should have compatibly sized chunks. For that to work, we would need to coerce the Kerchunk reference file to match our multiscales spec and use reference-spec-reader via a browser-based Zarr client (we're currently using zarr-js, where this would take some work).
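As a rough way to check point (2) against a given reference file, one could scan its entries for chunks above a size budget. This is only a sketch: the 10MB threshold is the ballpark mentioned above rather than a documented limit, `findOversizedChunks` is a hypothetical helper and not part of @carbonplan/maps, and the sample entries are made up. The byte lengths in a reference file are stored (possibly compressed) sizes, so decoded chunks may be larger in memory.

```javascript
// Ballpark budget taken from the comment above; not a hard limit of the library.
const MAX_CHUNK_BYTES = 10 * 1024 * 1024

// Hypothetical helper: list the chunk references whose stored byte length
// exceeds the budget. Entries are [url, offset, length] in the Kerchunk
// version 1 layout; inline strings and whole-file references ([url] only)
// carry no length and are skipped here.
function findOversizedChunks(refs, maxBytes = MAX_CHUNK_BYTES) {
  return Object.entries(refs)
    .filter(
      ([, entry]) =>
        Array.isArray(entry) && entry.length === 3 && entry[2] > maxBytes
    )
    .map(([key, entry]) => ({ key, bytes: entry[2] }))
}

// Made-up sample entries for illustration only.
const sampleRefs = {
  '.zgroup': '{"zarr_format":2}',
  'temperature/0.0.0.0': ['https://example.com/data.nc', 8192, 50 * 1024 * 1024],
  'temperature/0.0.0.1': ['https://example.com/data.nc', 52436992, 1024 * 1024],
}

for (const { key, bytes } of findOversizedChunks(sampleRefs)) {
  console.log(`${key}: ${(bytes / 1024 / 1024).toFixed(1)} MB`)
}
```

If a check like this flags most chunks in a NetCDF-derived reference file, the data would likely need to be rechunked before it could be fetched comfortably in the browser, which gives up the storage saving that motivated the question.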