weiji14 / foss4g2023oceania

The ecosystem of geospatial machine learning tools in the Pangeo world.
https://hackmd.io/@weiji14/foss4g2023oceania
GNU Lesser General Public License v3.0

:busts_in_silhouette: Compare data loading between kvikIO and Zarr engine #7

Closed weiji14 closed 1 year ago

weiji14 commented 1 year ago

Getting the hard numbers on how much faster the GPU-based kvikIO engine is than the CPU-based Zarr engine.

[Figure: compare_kvikio_zarr]

Formula for calculation:

$$ \frac{Slower - Faster}{Slower} = \frac{16.0 - 11.9}{16.0} = 0.25625 \approx 25.6\%$$

I.e. the kvikIO engine takes ~25% less time than the Zarr engine to load the ERA5 subset dataset.
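For anyone wanting to redo the arithmetic, here is a minimal sketch of the same calculation in Python. The function and variable names are just illustrative (not from the benchmark notebooks), and the inputs are the two rounded timings quoted above, so the result can differ slightly from one computed on unrounded measurements.

```python
def relative_time_saving(slower: float, faster: float) -> float:
    """Fraction of the slower run's time that the faster run saves."""
    if slower <= 0:
        raise ValueError("slower must be a positive duration")
    return (slower - faster) / slower


# Rounded timings from the formula above (hypothetical variable names).
zarr_time = 16.0
kvikio_time = 11.9

saving = relative_time_saving(slower=zarr_time, faster=kvikio_time)
print(f"kvikIO takes {saving:.1%} less time than Zarr")
```

Dividing by the slower time gives the share of time saved; dividing the slower by the faster time instead would give a speed-up factor, which is a different number.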

Note that:

Preview results in 2_compare_results.ipynb notebook.

TODO:

RichardScottOZ commented 1 year ago

Have you tried this one with COGs?

weiji14 commented 1 year ago

> Have you tried this one with COGs?

No, kvikIO doesn't work with GeoTIFFs yet unfortunately; someone would need to implement that with cuFile and all. But I'm wondering if there's a way to hack something together using kerchunk's tiff_to_zarr.