richardogoma / bitcoin-rate-etl

An ETL pipeline to ingest near-real-time data on Bitcoin rates across major currencies (USD/GBP/EUR) from the CoinDesk Bitcoin Price Index API.
MIT License

Discontinuous timestamp in loaded data --persisting #17

Closed richardogoma closed 1 year ago

richardogoma commented 1 year ago

The resolution in #11, which involved the setup_etl.sh script and syncing the initial program trigger with the start of the next minute, does not cover everything; there is a further cause. Consider what happens when the nohup command is fired:

# Start the ETL pipeline in the background and append stdout to output.log
nohup nice -n 10 python3 -u etl_pipeline.py >> output.log 2>&1 &

There is a delay, maybe a minute, before Python initializes and executes the first run. Each subsequent run then starts 60 seconds after the previous run finishes, so the execution time of every run is added on top of the sleep interval. The next run therefore doesn't start at the top of the next minute, but 60 seconds after the last run completed.

# Wait for one minute before fetching the next update
time.sleep(60)

To solve this discontinuous timestamp issue, we have to programmatically calculate the appropriate delay in seconds instead of sleeping for a blanket 60 seconds.
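A minimal sketch of one way to calculate that delay, assuming the goal is to wake up at the start of each wall-clock minute. The names `seconds_until_next_minute` and `run_aligned` are hypothetical and are not taken from etl_pipeline.py; `job` stands in for one fetch-and-load cycle.

```python
import time


def seconds_until_next_minute(now=None):
    """Return the number of seconds left until the next minute boundary.

    `now` is a Unix timestamp in seconds; it defaults to the current time.
    """
    if now is None:
        now = time.time()
    # Unix timestamps divisible by 60 fall on a minute boundary, so the
    # remainder is how far we already are into the current minute.
    return 60.0 - (now % 60.0)


def run_aligned(job, runs, clock=time.time, sleep=time.sleep):
    """Call `job` `runs` times, sleeping until the next minute after each call.

    `clock` and `sleep` are injectable (an assumption made for testability).
    """
    for _ in range(runs):
        job()
        # The delay is recomputed after the job finishes, so the time the
        # job took is subtracted automatically and drift does not accumulate.
        sleep(seconds_until_next_minute(clock()))
```

In the existing loop this would amount to replacing `time.sleep(60)` with `time.sleep(seconds_until_next_minute())`. Note that if a single run takes longer than 60 seconds, this approach skips that minute rather than trying to catch up.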