rustprooflabs / pgosm-flex

PgOSM Flex provides high quality OpenStreetMap datasets in PostGIS (Postgres) using the osm2pgsql Flex output.
MIT License

Allow indexes (SP-GIST and others) to be set more selectively #286

Closed rustprooflabs closed 1 year ago

rustprooflabs commented 1 year ago

Details

Version 0.6.3 added the --sp-gist option to switch from standard GIST indexes to SP-GIST indexes. Currently this is an all-or-nothing choice: you cannot enable --sp-gist for a single table or a subset of tables. This line of thought led me to consider customizing all indexes. Now that index creation is fully configurable via the Lua styles, without having to use post-processing SQL, it shouldn't be too difficult to add the ability to enable or disable specific indexes.

Why

I assume the advantages of SP-GIST indexes will not be an across-the-board win, so an all-or-nothing option may fall short. The same is true for all other indexes. If you never filter objects by name or osm_type, creating those indexes only makes the import take longer and consume more disk space. This becomes especially important on larger regions.

How

My initial idea is to add a configuration file (optional at run time) that allows customizing indexes per table. I imagine this would be another .ini file, like those under pgosm-flex/flex-config/layerset/* . Likely a new helper method would be created, e.g. get_table_indexes(table_name), that would look up the table name in the configuration and fall back to defaults for anything not set. Each table creation would need to be adjusted to use this new helper.

A simple ini example might look like the following to configure only the sp-gist aspect.

[gist-indexes]
place_polygon_spgist=true
place_point_spgist=false
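
As a rough illustration of how a file in this flat form could be read, here is a minimal Python sketch using the standard library's configparser. The function name use_spgist, the inline SIMPLE_INI string, and the default of false for unlisted tables are assumptions made for the example, not part of PgOSM Flex.

```python
import configparser

# Inline copy of the simple example above; a real run would read a file.
SIMPLE_INI = """
[gist-indexes]
place_polygon_spgist=true
place_point_spgist=false
"""


def use_spgist(config, table_name, default=False):
    """Return True if the table should use SP-GIST instead of GIST.

    Hypothetical helper: looks up "<table_name>_spgist" in the
    [gist-indexes] section and falls back to `default` when the
    key (or the whole section) is missing.
    """
    key = f"{table_name}_spgist"
    return config.getboolean("gist-indexes", key, fallback=default)


simple_config = configparser.ConfigParser()
simple_config.read_string(SIMPLE_INI)

for table in ("place_polygon", "place_point", "road_line"):
    print(table, use_spgist(simple_config, table))
```

Tables that are not listed, such as road_line here, simply get the default, so the file only needs to name the tables that deviate from it.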

Now that index creation is fully (or near fully) configurable via the Lua styles, I imagine it might be handy to have the ability to enable or disable specific indexes. That would require a slightly more complicated ini setup; one section per table seems like an approach to consider.

[place_point]
spgist=false
index_osm_type=true
index_name=true

[place_polygon]
spgist=true
index_osm_type=false
index_name=true
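
To make the per-table layout concrete, the following Python sketch shows one way a get_table_indexes helper might merge a table's section with defaults. It is only a sketch under assumptions: the DEFAULTS values are invented for illustration, the extra config argument is not in the signature proposed above, and the actual helper would more likely live alongside the Lua styles.

```python
import configparser

# Hypothetical defaults, chosen only for illustration.
DEFAULTS = {"spgist": False, "index_osm_type": True, "index_name": True}

# Inline copy of the per-table example above; a real run would read a file.
PER_TABLE_INI = """
[place_point]
spgist=false
index_osm_type=true
index_name=true

[place_polygon]
spgist=true
index_osm_type=false
index_name=true
"""


def get_table_indexes(config, table_name):
    """Return the index settings for one table as a dict of booleans.

    Options missing from the table's section, and tables with no
    section at all, fall back to DEFAULTS.
    """
    settings = dict(DEFAULTS)
    if config.has_section(table_name):
        for option, default in DEFAULTS.items():
            settings[option] = config.getboolean(
                table_name, option, fallback=default
            )
    return settings


table_config = configparser.ConfigParser()
table_config.read_string(PER_TABLE_INI)

print(get_table_indexes(table_config, "place_polygon"))
print(get_table_indexes(table_config, "road_line"))  # no section: defaults
```

Because the configuration file is optional, passing an empty ConfigParser yields the defaults for every table, which would keep existing behavior unchanged when no file is supplied.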