maplibre / martin

Blazing fast and lightweight PostGIS, MBtiles and PMtiles tile server, tile generation, and mbtiles tooling.
https://martin.maplibre.org
Apache License 2.0

Store hashing algorithm in the metadata table #1086

Open nyurik opened 7 months ago

nyurik commented 7 months ago

mbtiles and martin-cp should standardize how tile data is hashed in MBTiles files. tilelive-copy has been generating the normalized mbtiles schema with MD5 hashes of the tile data as the table's foreign key, so it has been possible to validate a tile's content by re-computing its MD5 hash.
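A minimal sketch of that validation, assuming the normalized schema stores tiles in an `images(tile_id, tile_data)` table where `tile_id` is the lowercase hex MD5 digest of `tile_data`. The function name `find_mismatched_tiles` and the in-memory demo data are hypothetical, not part of any tool mentioned here.

```python
import hashlib
import sqlite3


def find_mismatched_tiles(conn):
    """Return tile_ids that do not equal the MD5 hex digest of their tile_data."""
    mismatched = []
    # Assumed layout: images(tile_id TEXT, tile_data BLOB)
    for tile_id, tile_data in conn.execute("SELECT tile_id, tile_data FROM images"):
        if hashlib.md5(tile_data).hexdigest() != str(tile_id).lower():
            mismatched.append(tile_id)
    return mismatched


# Demo on an in-memory database with one valid and one corrupted entry.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (tile_id TEXT PRIMARY KEY, tile_data BLOB)")
good = b"tile-a"
conn.execute("INSERT INTO images VALUES (?, ?)", (hashlib.md5(good).hexdigest(), good))
conn.execute("INSERT INTO images VALUES (?, ?)", ("deadbeef", b"tile-b"))
bad_ids = find_mismatched_tiles(conn)
```

A check like this only works if the validator knows which algorithm produced `tile_id`, which is the gap this issue is about.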

tippecanoe (cc @bdon) has used a faster fnv1a checksum algorithm, so clearly there is a need for more than one algorithm.
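For reference, FNV-1a is small enough to write in a few lines. The sketch below shows the standard 64-bit variant; which width and output encoding tippecanoe actually uses is an assumption here and should be checked against its source.

```python
# Standard FNV-1a 64-bit parameters.
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Compute the 64-bit FNV-1a hash of a byte string."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte                       # FNV-1a: XOR the byte in first...
        h = (h * FNV64_PRIME) & MASK64  # ...then multiply, wrapping at 64 bits
    return h


# Hypothetical usage: hash a tile blob and format it as a hex key.
tile_key = format(fnv1a_64(b"tile-a"), "016x")
```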

Proposal

Multiple Algorithms Consideration

MBTiles files use hashes in two places: for the tile_data column and for the agg_tiles_hash metadata field. In theory these could use different algorithms for performance and security(?) reasons. If so, we could either have two metadata fields (hash_algorithm and agg_hash_algorithm) or, better yet, rename agg_tiles_hash to agg_tiles_hash_md5 (or whichever algorithm is used). This keeps the algorithm and its value together, and is fairly easy to detect at runtime.
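As an illustration of the runtime detection, the sketch below scans the `metadata(name, value)` table for a key of the form `agg_tiles_hash_<algo>` and splits the algorithm name off the key. The function name `detect_agg_hash` is hypothetical, and treating the unsuffixed legacy key as "algorithm unspecified" is an assumption of this sketch, not something decided in this issue.

```python
import sqlite3

LEGACY_KEY = "agg_tiles_hash"
PREFIX = "agg_tiles_hash_"


def detect_agg_hash(conn):
    """Return (algorithm, value) for the aggregate hash, or None if absent.

    The algorithm is None when only the unsuffixed legacy key is present.
    """
    legacy = None
    for name, value in conn.execute("SELECT name, value FROM metadata"):
        if name.startswith(PREFIX) and len(name) > len(PREFIX):
            # The suffix after the prefix names the algorithm, e.g. "md5".
            return name[len(PREFIX):], value
        if name == LEGACY_KEY:
            legacy = (None, value)
    return legacy


# Demo with a suffixed key, as proposed above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (name TEXT, value TEXT)")
conn.execute("INSERT INTO metadata VALUES ('agg_tiles_hash_md5', 'abc123')")
detected = detect_agg_hash(conn)
```

Filtering in Python rather than with SQL `LIKE` avoids the underscore being interpreted as a single-character wildcard.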

bdon commented 7 months ago

> faster fnv1a checksum algorithm

In tippecanoe we needed a content hash with the smallest implementation to inline. I'm unsure how its speed compares to md5, but if you need speed you'll probably want something like xxhash anyway.

nyurik commented 7 months ago

@bdon this proposal is to support and standardize all such algorithms, so that if tippecanoe creates an mbtiles file, other tools can validate it and possibly add more content while using the same algorithm as the original.