Open justb4 opened 6 years ago
some impact assement:
1] what database changes are needed? 2] what code changes are needed? 3] What are the performance benefits?
a concern: the SensorThings Observation entity has time range properties (ValidTime, PhenomenonTime). Normally the Postgres column type 'tsrange' is used for time ranges, but I read TimeScale DB does not use these? https://github.com/timescale/timescaledb/issues/277
I would expect performance benefits in queries and aggregations on time-fields. Many apps use nosql or InfluxDB databases, but these are usually weak in geospatial support. I would expect what I call "best of both worlds" (temporal and spatial) with TimescaleDB, like in inserts and queries.
The impact: hard to give absolute assessments, have no working experience with TimescaleDB. I would suggest to focus first on indexing (as in #3 and gost/server#152) as that is simple to implement and measure and to perform a PoC with TimescaleDB.
did some quick experiments with Timescale with a 10 million records NYC taxi table (https://docs.timescale.com/v0.11/tutorials/tutorial-hello-nyc).
Commands used: https://gist.github.com/bertt/2faf910e5417c48e9a3a57266fe53a8a
Some notes:
if we want to experiment with Timescale and GOST the following actions could be taken:
created experimental branch for this https://github.com/gost/gost-db/tree/timescale_db
changes:
docker image based on timescale/timescaledb-postgis:latest-pg10 instead of mdillon/postgis
schema creation sql script:
1] add 'create extension timescaledb'
2] changed table Observation, moved period phenomenonTime from data jsonb to columns phenomenonTime_start and phenomenonTime_end
3] Create hypertable using column phenomenonTime_ start from Observation:
SELECT create_hypertable('observation', 'phenomenontime_start',
chunk_time_interval => interval '1 day');
todo: check if this works on rpi too
next step: get application working with these database changes
(with @tomvantilburg at OSGeo.org codesprint Bonn) : when going to millions or billions of Observations use of a time-series DB like InfluxDB or Prometheus may be beneficial. But PostgreSQL with PostGIS remains a strong option for geospatial data (e.g. InfluxDB has no official geospatial facilities).
To combine the best of both worlds (time-series and geospatial) TimeScale DB (with PosGIS) may be considered as a spatiotemporal DB solution for GOST.