blockchain-etl / bitcoin-etl

ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
https://twitter.com/BlockchainETL
MIT License

How to speed up the enrich process #31

Closed jsvisa closed 4 years ago

jsvisa commented 5 years ago

In the export_all process, the enrichment step seems to take most of the time. How can it be sped up?

medvedev1088 commented 5 years ago

You are right, it's a very time-consuming operation. For the historical export I disable enrichment during export and do the enrichment afterwards in BigQuery. This is the SQL I use in my Airflow DAG: https://github.com/blockchain-etl/bitcoin-etl-airflow/blob/master/dags/resources/stages/enrich/sqls/transactions.sql. It runs for just a few minutes. The enriched transactions can then be exported to JSON files in GCS.
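To make the "enrich afterwards" idea concrete, here is a minimal Python sketch of the same join done in memory instead of in BigQuery. It is not the linked SQL, only an illustration of the approach. The function name `enrich_transactions` is hypothetical, and the field names (`hash`, `index`, `spent_transaction_hash`, `spent_output_index`, `addresses`, `value`, `type`) are assumed to match the exported transaction JSON.

```python
def enrich_transactions(transactions):
    """Return copies of the transactions with each input enriched from the output it spends.

    Hypothetical helper: assumes the whole exported data set is available at once.
    """
    # First pass: index every output by (transaction hash, output index).
    outputs_by_key = {}
    for tx in transactions:
        for out in tx["outputs"]:
            outputs_by_key[(tx["hash"], out["index"])] = out

    # Second pass: join each input to the output it spends.
    enriched = []
    for tx in transactions:
        new_inputs = []
        for inp in tx["inputs"]:
            key = (inp.get("spent_transaction_hash"), inp.get("spent_output_index"))
            spent = outputs_by_key.get(key)
            new_inp = dict(inp)  # copy so the caller's data is not mutated
            if spent is not None:
                new_inp["addresses"] = spent["addresses"]
                new_inp["value"] = spent["value"]
                new_inp["type"] = spent["type"]
            new_inputs.append(new_inp)
        enriched.append({**tx, "inputs": new_inputs})
    return enriched
```

Because the output index is built once over the full data set, no per-input lookup against the node is needed. Inputs that do not match any output, such as coinbase inputs, are copied through unchanged.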

jsvisa commented 5 years ago

Thanks. For now, I load all the UTXO outputs into Redis as a cache, so every input can be looked up there. This speeds up the whole process.
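A rough sketch of that caching idea, under stated assumptions: the `DictCache` class below is an in-memory stand-in so the example runs without a Redis server, and it exposes only `get`, `set`, and `delete`, which a redis-py client also provides. The key layout, the helper names, and the optional `fallback_lookup` callback are all hypothetical and not taken from the actual setup described in this thread.

```python
import json


class DictCache:
    """In-memory stand-in offering the get/set/delete subset of a Redis client."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


def utxo_key(tx_hash, index):
    # Hypothetical key layout: one cache entry per unspent output.
    return f"utxo:{tx_hash}:{index}"


def cache_outputs(cache, tx):
    """Store every output of a transaction so later inputs can find it."""
    for out in tx["outputs"]:
        cache.set(utxo_key(tx["hash"], out["index"]), json.dumps(out))


def enrich_inputs(cache, tx, fallback_lookup=None):
    """Fill addresses/value/type on each input from the cached output it spends."""
    for inp in tx["inputs"]:
        tx_hash = inp.get("spent_transaction_hash")
        if tx_hash is None:
            continue  # coinbase input: nothing to look up
        index = inp["spent_output_index"]
        key = utxo_key(tx_hash, index)
        raw = cache.get(key)
        if raw is not None:
            spent = json.loads(raw)
            cache.delete(key)  # an output is spent only once, so free the entry
        elif fallback_lookup is not None:
            spent = fallback_lookup(tx_hash, index)  # e.g. a slower node query
        else:
            spent = None
        if spent is not None:
            inp["addresses"] = spent["addresses"]
            inp["value"] = spent["value"]
            inp["type"] = spent["type"]
    return tx
```

Transactions would be processed in chain order, calling `enrich_inputs` and then `cache_outputs` for each one, so that an output is always cached before the input that spends it arrives. Deleting entries on spend keeps the cache roughly the size of the UTXO set, at the cost of needing the fallback if a block is ever reprocessed.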