timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

Timescaledb for edge devices #1556

Open archerian opened 4 years ago

archerian commented 4 years ago

General question: I've seen performance metrics for (dedicated) server-grade hardware. How well suited is TimescaleDB for managing data on lower-powered HW, like single/dual-core CPUs with 4-8GB RAM on a containerized system?

mfreed commented 4 years ago

We test TimescaleDB on low-resource 32-bit ARM devices as part of our build process, and we have production deployments on edge IoT devices akin to the Raspberry Pi. So your config should certainly be fine!

A good testing platform for you would actually be the dev instances you can use on Timescale Cloud, which have 2vCPU (one physical core) and 4GB RAM.

wrobell commented 4 years ago

I am in the process of migrating my edge processing framework and projects from HDF5 to TimescaleDB.

When looking into TimescaleDB, I was considering

I have three Raspberry Pi devices

As you can see, TimescaleDB has certain limitations and needs more work before it can use all PostgreSQL features. Despite these limitations, after about 3 weeks of using it, I find its current version a very good storage solution for edge processing projects.

mfreed commented 4 years ago

Hi @wrobell thanks for the detailed report.

Would you mind summarizing what you think TimescaleDB's limitations are given the above? Overall it seemed like a pretty strong argument for TimescaleDB, and the only thing I noticed was one deployment where TimescaleDB compressed from 1.2GB to 70MB versus ~35MB for HDF5.

wrobell commented 4 years ago

The compression example I provided is probably like comparing apples to oranges: different chunks, and I am not sure about the compression parameters or even the algorithms themselves. The difference in this particular example did not matter, so I decided not to investigate further and accepted it as it is. I should have been clearer about this.

The limitations of TimescaleDB

wrobell commented 4 years ago

I have finally migrated my project, which fetches ADS-B/Mode-S messages on a Raspberry Pi Zero, to TimescaleDB. It has over 520 million entries and takes 6.5 GB of storage (45 GB uncompressed; my initial assessment was more pessimistic).
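As a rough sanity check on those figures, the compression ratio and per-entry footprint can be worked out directly. This is a back-of-the-envelope sketch only: it assumes GB means 10**9 bytes and treats "over 520 million" as exactly 520 million, so the results are approximate.

```python
# Figures quoted in the comment above; GB is assumed to mean 10**9 bytes.
entries = 520_000_000        # "over 520 million", so effectively a lower bound
compressed_bytes = 6.5e9     # 6.5 GB on disk
uncompressed_bytes = 45e9    # 45 GB uncompressed

ratio = uncompressed_bytes / compressed_bytes
per_entry_compressed = compressed_bytes / entries
per_entry_uncompressed = uncompressed_bytes / entries

print(f"compression ratio: {ratio:.1f}x")
print(f"compressed: {per_entry_compressed:.1f} bytes/entry")
print(f"uncompressed: {per_entry_uncompressed:.1f} bytes/entry")
```

Under these assumptions that is roughly a 7x reduction, from about 87 bytes to about 12.5 bytes per entry.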

This project also exposes a limitation of TimescaleDB's current chunking approach, which is time based (i.e. hour, day). I prefer chunks to be filled with data as evenly as possible, so I had to choose a 1 day chunk size. 1 hour chunks would contain anywhere from zero to over 200 thousand records. The attached screenshot should illustrate the problem: it shows the number of ADS-B/Mode-S messages received over time. Maybe count-based chunking will be possible in the future as well?

[Screenshot: sq1090]
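The uneven fill described above can be illustrated with a small simulation. This is a minimal sketch in plain Python, not TimescaleDB code: the traffic pattern is made up (200 messages per hour by day, 2 per hour at night), and `chunk_by_time` and `chunk_by_count` are hypothetical helpers. Count-based chunking is shown only as the idea requested in the comment, not as an existing feature.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def chunk_by_time(timestamps, interval):
    """Rows per fixed-width time chunk (chunks with no rows are omitted)."""
    counts = Counter((ts - EPOCH) // interval for ts in timestamps)
    return [counts[key] for key in sorted(counts)]

def chunk_by_count(timestamps, rows_per_chunk):
    """Rows per chunk if a chunk were closed after a fixed number of rows."""
    total = len(timestamps)
    return [min(rows_per_chunk, total - start)
            for start in range(0, total, rows_per_chunk)]

# Made-up bursty traffic over two days: busy 08:00-20:00, nearly silent otherwise.
start = datetime(2020, 1, 1)
timestamps = []
for hour in range(48):
    rate = 200 if 8 <= hour % 24 < 20 else 2   # messages in this hour
    base = start + timedelta(hours=hour)
    timestamps += [base + timedelta(seconds=i * 3600 / rate) for i in range(rate)]

hourly = chunk_by_time(timestamps, timedelta(hours=1))
daily = chunk_by_time(timestamps, timedelta(days=1))
by_count = chunk_by_count(timestamps, 1000)

print("hourly chunks: min", min(hourly), "max", max(hourly))
print("daily chunks:", daily)
print("count-based chunks:", by_count)
```

With this synthetic data, hourly chunks differ in size by a factor of 100, while daily chunks come out equal because the daily pattern repeats. Real traffic would not be this regular, so daily chunks smooth the variation rather than remove it.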