gost / gost-db

MIT License
0 stars 2 forks source link

Suggestion: use TimeScaleDB #4

Open justb4 opened 6 years ago

justb4 commented 6 years ago

(with @tomvantilburg at OSGeo.org codesprint Bonn) : when going to millions or billions of Observations use of a time-series DB like InfluxDB or Prometheus may be beneficial. But PostgreSQL with PostGIS remains a strong option for geospatial data (e.g. InfluxDB has no official geospatial facilities).

To combine the best of both worlds (time-series and geospatial) TimeScale DB (with PosGIS) may be considered as a spatiotemporal DB solution for GOST.

bertt commented 5 years ago

some impact assement:

1] what database changes are needed? 2] what code changes are needed? 3] What are the performance benefits?

bertt commented 5 years ago

a concern: the SensorThings Observation entity has time range properties (ValidTime, PhenomenonTime). Normally the Postgres column type 'tsrange' is used for time ranges, but I read TimeScale DB does not use these? https://github.com/timescale/timescaledb/issues/277

justb4 commented 5 years ago

I would expect performance benefits in queries and aggregations on time-fields. Many apps use nosql or InfluxDB databases, but these are usually weak in geospatial support. I would expect what I call "best of both worlds" (temporal and spatial) with TimescaleDB, like in inserts and queries.

The impact: hard to give absolute assessments, have no working experience with TimescaleDB. I would suggest to focus first on indexing (as in #3 and gost/server#152) as that is simple to implement and measure and to perform a PoC with TimescaleDB.

bertt commented 5 years ago

did some quick experiments with Timescale with a 10 million records NYC taxi table (https://docs.timescale.com/v0.11/tutorials/tutorial-hello-nyc).

Commands used: https://gist.github.com/bertt/2faf910e5417c48e9a3a57266fe53a8a

Some notes:

bertt commented 5 years ago

if we want to experiment with Timescale and GOST the following actions could be taken:

bertt commented 5 years ago

created experimental branch for this https://github.com/gost/gost-db/tree/timescale_db

changes:

1] add 'create extension timescaledb'

2] changed table Observation, moved period phenomenonTime from data jsonb to columns phenomenonTime_start and phenomenonTime_end

3] Create hypertable using column phenomenonTime_ start from Observation:

SELECT create_hypertable('observation', 'phenomenontime_start',
    chunk_time_interval => interval '1 day');

todo: check if this works on rpi too

next step: get application working with these database changes