Open giggls opened 4 years ago
I have done a lot of refactoring on the code that creates the geometries for PostgreSQL/PostGIS in the last months. It should now be relatively straightforward to add in code for generating TWKB instead of the default EWKB. But it's not something high on anybody's priority list and it's unclear whether it is worth the extra complexity.
If somebody wants to tackle this: Look for the geom_to_ewkb()
function and a a geom_to_twkb()
which does basically the same thing only for TWKB. Then add some kind of flag so the user can choose which one is called from the Lua config. Please talk to us here first, though, before you spend a lot of time on it.
I have been using ST_AsTWKB function from PostGIS to generate the tables for my example. Maybe these can be re-used.
TWKB will only make sense for an epsg:4326 version of the database.
If I correctly understand the geom_to_ewkb function correctly, the geometry is passed in geom format which I assume would be floating-point, right?
I don't know if converting fixed-point raw osm to floating point and back to fixed-point (for twkb) will work well.
Unfortunately you can not use ST_AsTWKB()
with osm2pgsql, because we are using COPY
to import the data which doesn't allow putting a function in there. You could do something with a trigger though. If PostGIS would simply allow "magic" conversion of the normal geometry
type to twkb
it would simply work. I don't see any reason why that wouldn't be possible but PostGIS doesn't allow it.
Yes, the geometry classes in osm2pgsql use double precision floating point values for the coordinates. That's what you have to work with.
If PostGIS would simply allow "magic" conversion of the normal
geometry
type totwkb
it would simply work.
Would it be possible to define a generated column via the Lua style? If we could create generated columns it could be defined with ST_AsTWKB()
. Of course that duplicates the geometry in the database but I already do that in a variety of scenarios because the benefit is often worth that storage overhead.
The sole reason for using TWKB is to reduce table size thus adding an additional column would invalidate using TWKB at all.
TWKB is a multi-purpose format for serializing vector geometry data into a byte buffer, with an emphasis on minimizing size of the buffer. The format is already supported by PostGIS.
See https://github.com/TWKB/Specification/blob/master/twkb.md for details.
It is somewhat similar to PBF format.
While this will impact the database import and query speed in some way it will help to reduce the size of production tables in a significant way.
Importing a small extract of the planet I was able to reduce the size of the standard production tables by 36.3 % using seven digits after the decimal point, which should be the original resolution of the imported OSM data.
Here are the results of calling pg_relation_size on them:
A TWKB output option might well make sense using the generic backend as well as the new flex backend.