Closed anthonyra closed 11 months ago
Also an additional thought to the above;
This query updates the gateway_status
table, I wonder if it'd also make sense to have a quick_sync_mode
that you can set that will skip this status update until the height you specify? That's dependent on if the data used for this INSERT
is simply built from the current ledger state found on the ETL?
https://github.com/helium/blockchain-etl/blob/4fb696f1ff203689eb6e076383f6efe651b495ef/src/be_db_gateway_status.erl#L90-L106
Upon looking into some performance issues I'm having right now with my ETL I noticed that the above
INSERT
is eating up a lot of time while being committed. (I'm not 100% sure if it's the only culprit but the one that stands out the most right now).Looking into optimizations and fixes I stumbled upon some suggestions about populating a DB; https://www.postgresql.org/docs/11/populate.html
With some benchmarks: https://www.depesz.com/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/
It appears that these
INSERTS
should be batched and even more so converted intoCOPY
if at all possible. Upon reviewing the code I noticed that an ETS is being used to store the requests prior to them being sent to Postgres. I'm not that well versed yet when it comes to Erlang and ETS but could that table be converted to either CSV or preferably the Binary format needed forCOPY
?I looked into epgsql to ensure that copy is supported and it appears that in 2020 it was added; https://github.com/epgsql/epgsql/blob/devel/test/epgsql_copy_SUITE.erl