timescale / timescaledb-parallel-copy

A binary for parallel copying of CSV data into a TimescaleDB hypertable
https://www.timescale.com/
Apache License 2.0
371 stars, 55 forks

Performance tuning tips #54

Open brettwooldridge opened 2 years ago

brettwooldridge commented 2 years ago

This is a fairly open-ended question. I am looking for any experience or tips on performance tuning to increase import speed. We are trying to import several billion rows of data from InfluxDB into TimescaleDB.

Batch size? Copy options? PostgreSQL tuning parameters?

jonatas commented 2 years ago

Here are a few good tips: https://www.timescale.com/blog/13-tips-to-improve-postgresql-insert-performance/

Also, there are some good ideas for large datasets here: https://github.com/timescale/timescaledb-extras#useful-utilities

I'd also encourage you to enable progress reporting (--reporting-period) and start testing with different batch sizes (--batch-size) and numbers of parallel workers (--workers). The best values will depend on the machine you're using.
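
One way to run that kind of test is to sweep a small grid of worker counts and batch sizes over a sample file and time each run. The sketch below is only an illustration: the file name, table name, and connection string are hypothetical placeholders, and the flag names (--connection, --table, --file, --workers, --batch-size, --reporting-period) are written as I recall them from the tool's README, so check them against `timescaledb-parallel-copy --help` for your installed version. By default it only prints the commands (dry run) rather than executing them.

```python
import itertools
import shlex
import subprocess
import time


def build_command(csv_path, table, connection, workers, batch_size,
                  reporting_period="30s"):
    """Build the argument list for one timescaledb-parallel-copy run.

    Flag names are assumed from the project README; verify with --help.
    """
    return [
        "timescaledb-parallel-copy",
        "--connection", connection,
        "--table", table,
        "--file", csv_path,
        "--workers", str(workers),
        "--batch-size", str(batch_size),
        "--reporting-period", reporting_period,
    ]


def sweep(csv_path, table, connection, worker_counts, batch_sizes,
          dry_run=True):
    """Try every (workers, batch_size) pair and record the wall-clock time.

    With dry_run=True the commands are printed, not executed, and the
    elapsed time is recorded as None.
    """
    results = []
    for workers, batch_size in itertools.product(worker_counts, batch_sizes):
        cmd = build_command(csv_path, table, connection, workers, batch_size)
        if dry_run:
            print(shlex.join(cmd))
            results.append((workers, batch_size, None))
            continue
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # raises if the copy fails
        results.append((workers, batch_size, time.perf_counter() - start))
    return results


if __name__ == "__main__":
    # Hypothetical sample file, table, and connection string.
    sweep(
        csv_path="sample.csv",
        table="conditions",
        connection="host=localhost user=postgres dbname=tsdb sslmode=disable",
        worker_counts=[2, 4, 8],
        batch_sizes=[1000, 5000, 20000],
    )
```

For the timings to be comparable, each run should load the same sample into an empty table, so truncate the target table between runs when running with dry_run=False. A sample of a few million rows is usually enough to see which settings are worth trying on the full import.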