manaakiwhenua / dggs-lu-tex

Latex files for paper submitted to Big Earth Data: "Using a DGGS for a scalable, interoperable, and reproducible system of land-use classification"

Page 11 Line 3 #4

alpha-beta-soup closed 3 months ago

alpha-beta-soup commented 3 months ago

I don't see why a DGGS may perform better than raster tiles; maybe some illustration is needed.

alpha-beta-soup commented 3 months ago

We don't argue in the paper that a DGGS performs better. Rather, we're explicit that the DGGS and raster approaches can be equivalent, and this is substantiated by our benchmark results.

P9, L10-19, emphasis added:

The results (Figure 6) demonstrate equivalence between these techniques, which is expected as the conceptual similarity between how we use the DGGS data model and the raster data model is much greater than in the vector case. Classification of DGGS data is one order of magnitude faster than the raster method due to benefits accruing primarily to using a columnar data store (Apache Parquet) classified with a multi-threaded online analytical processing (OLAP) query engine (Polars) which can easily apply classification rules in parallel across all DGGS zones (Vink et al., 2024). The raster method could have been optimised further, so the difference is not considered significant; rather this demonstrates no expected performance penalty for using DGGS over raster data models to perform equivalent work.

P10, L35-36:

A DGGS workflow therefore has the spatial alignment benefits of a raster workflow, but fewer downsides.

This is in the context of arguing that a DGGS is akin to a raster in terms of the discretisation of space, that partitioning is akin to raster windows, and so on. However, some rich data types, particularly text data, are lost when an entire workflow is converted to the raster data model, whereas a DGGS workflow can retain them. This is not a performance benefit; rather, it is a benefit to analytical flexibility and to the maintainability of any classification code.