dewi-alliance / dewi-etl

Publicly available Helium blockchain data - https://etl.dewi.org
MIT License
9 stars 1 forks source link

Optimize large/slow database tables #11

Open jamiew opened 3 years ago

jamiew commented 3 years ago

Some of the etl-queries tables are pretty big now, particularly challenge_receipts_parsed

Easy things might be tuning indexes. That one has both address and name for witness and transmitter, could just be address and always use subselects for names

Would more recent postgres versions or alternate database storage solutions help with others?

Related here is the pg_stat results on old DeWi ETL: https://etl.dewi.org/question/117-slow-queries-pg-stat-statements + xlsx dump query_result_2021-08-07T10 45 38.985426-04 00.xlsx

Here are all table sizes

select table_name, pg_relation_size(quote_ident(table_name)) 
from information_schema.tables
where table_schema = 'public'
order by 2 desc;
          table_name           | pg_relation_size
-------------------------------+------------------
 gateways                      |     115344318464
 transactions                  |      61467549696
 transaction_actors            |      54581166080
 challenge_receipts_parsed     |      28294864896
 rewards                       |      18037063680
 challenge_receipts_parsed_old |      13214097408
 accounts                      |       6230302720
 gateway_inventory             |       4090748928
 packets                       |       2231975936
 block_signatures              |       2152873984
 gateway_status                |       1353596928
 dc_burns                      |       1217060864
 validators                    |       1016700928
 account_inventory             |        282730496
 blocks                        |        182550528
 validator_inventory           |        176136192
 validator_status              |         99999744
 locations                     |         42696704
 spatial_ref_sys               |          4694016
 stats_inventory               |          1777664
 oracle_prices                 |           786432
 ouis                          |           360448
 vars_inventory                |           204800
 oui_inventory                 |            57344
 oracle_inventory              |             8192
 oracle_price_predictions      |             8192
 cheatnets                     |             8192
 __migrations                  |             8192
mistakeNot7 commented 2 years ago

I would suggest 2 things, before going into index tunning: timescaledb extension and table partitioning. I found these 2 to be more helpful with big tables, than indexes which are also growing in size as time passes.