Closed Ozxahmed closed 8 months ago
Here's the base code for scheduling from the lectures:
import schedule
import time

# Example
def job():
    print("I'm working...")

schedule.every(10).minutes.do(job)
schedule.every().hour.do(job)
schedule.every().day.at("10:30").do(job)
schedule.every().monday.do(job)
schedule.every().wednesday.at("13:15").do(job)
schedule.every().day.at("12:42", "Europe/Amsterdam").do(job)
schedule.every().minute.at(":17").do(job)

while True:
    schedule.run_pending()
    time.sleep(1)
A schedule is set by calling `schedule.every(X).minutes.do(job)`. When the job is due, the next `schedule.run_pending()` call runs the `job()` function.
The code above polls every second, using `time.sleep(1)` between checks, to see whether any job needs to be run.
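To make the polling idea concrete, here is a small standard-library sketch of what "register a job, then poll for due jobs" means. It is a toy model, not the `schedule` library's implementation: `IntervalJob` and `simulate` are hypothetical names, and the clock is simulated so the loop finishes instantly instead of sleeping.

```python
from datetime import datetime, timedelta


class IntervalJob:
    """Toy stand-in for a scheduled job (hypothetical, not schedule's own class)."""

    def __init__(self, interval, func, now):
        self.interval = interval
        self.func = func
        self.next_run = now + interval  # first run is one interval from now

    def run_if_due(self, now):
        """Run the job if its next_run time has passed, then reschedule it."""
        if now >= self.next_run:
            self.func()
            self.next_run = now + self.interval
            return True
        return False


def simulate(job, start, ticks, tick=timedelta(seconds=1)):
    """Poll once per tick, like the while loop, but with a simulated clock."""
    now = start
    runs = 0
    for _ in range(ticks):
        now += tick  # stands in for time.sleep(1)
        if job.run_if_due(now):
            runs += 1
    return runs


calls = []
start = datetime(2024, 1, 1, 0, 0, 0)
job = IntervalJob(timedelta(minutes=10), lambda: calls.append("ran"), start)

# One simulated hour of one-second polling
runs_in_one_hour = simulate(job, start, ticks=3600)
print(runs_in_one_hour)
```

The point of the sketch is that the job only runs when the loop happens to check, so the sleep interval sets how late a due job can be.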
I thought we could schedule the ETL to run every day and check each day whether records have been updated. I think this will work, but it could be a good case for a unit test:
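def job1():
    print("I'm working...")

# Schedule job1 (the function defined above) once a day
schedule.every().day.do(job1)

while True:
    schedule.run_pending()
    # With a full-day sleep, the job only gets checked once every 24 hours
    time.sleep(86400)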
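As a starting point for the unit test mentioned above, here is a rough sketch. It assumes the daily check boils down to comparing the newest record date we saw last time with the newest one available now. `records_updated` is a hypothetical helper, not something that exists in our repo yet.

```python
import unittest
from datetime import date


def records_updated(previous_max_date, current_max_date):
    """Return True when the newest available record is newer than the last one seen.

    Hypothetical helper: None means "no records seen / returned".
    """
    if current_max_date is None:
        return False
    if previous_max_date is None:
        return True
    return current_max_date > previous_max_date


class RecordsUpdatedTest(unittest.TestCase):
    def test_newer_date_counts_as_update(self):
        self.assertTrue(records_updated(date(2024, 1, 1), date(2024, 1, 2)))

    def test_same_date_is_not_an_update(self):
        self.assertFalse(records_updated(date(2024, 1, 2), date(2024, 1, 2)))

    def test_first_run_with_data_counts_as_update(self):
        self.assertTrue(records_updated(None, date(2024, 1, 2)))

    def test_empty_response_is_not_an_update(self):
        self.assertFalse(records_updated(date(2024, 1, 2), None))
```

Saved in a test file, this can be run with `python -m unittest`. Keeping the comparison in a plain function means the test never has to wait on the scheduler or hit the API.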
schedule.every(1).minutes.do(get_max_date_crime_data, APP_TOKEN=APP_TOKEN)

while True:
    schedule.run_pending()
    print("I'm working...")
    time.sleep(60)
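One thing that may matter here: as far as I know, `schedule` does not catch exceptions raised inside a job, so a failed API call would stop the `while True` loop. Below is a possible guard, sketched with the standard library only. `log_failures` and `flaky_job` are hypothetical names for illustration; the real job would be `get_max_date_crime_data`.

```python
import functools
import logging

logger = logging.getLogger("pipeline")


def log_failures(func):
    """Wrap a job so an exception is logged instead of crashing the polling loop."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception records the message plus the traceback
            logger.exception("Job %s failed", func.__name__)
            return None

    return wrapper


@log_failures
def flaky_job(APP_TOKEN):
    """Hypothetical stand-in for a job whose API call fails."""
    raise RuntimeError("simulated API failure")


# The error is logged and swallowed, so a surrounding loop would keep going
result = flaky_job(APP_TOKEN="not-a-real-token")
```

The wrapped function can be passed to `.do(...)` the same way as the unwrapped one, keyword arguments included.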
I pushed my mke-pipeline branch with the pipeline scheduling. Since it's not completely working yet, I made the changes in a copy of the api_connection.py file. I'll add more comments to the Slack conversation.
This issue is for working out how to schedule our pipeline and how to write metadata logs to a database table.
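For the metadata logging part, here is a minimal sketch of one way it could look. It uses SQLite from the standard library purely as a placeholder for whatever database we end up using. The table name `pipeline_run_log`, its columns, and the helpers `init_log_table` and `log_run` are all assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def init_log_table(conn):
    """Create the (hypothetical) run-log table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS pipeline_run_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            started_at TEXT NOT NULL,
            finished_at TEXT NOT NULL,
            status TEXT NOT NULL,
            rows_loaded INTEGER,
            message TEXT
        )
        """
    )
    conn.commit()


def log_run(conn, started_at, finished_at, status, rows_loaded=None, message=None):
    """Insert one row describing a single pipeline run."""
    conn.execute(
        "INSERT INTO pipeline_run_log "
        "(started_at, finished_at, status, rows_loaded, message) "
        "VALUES (?, ?, ?, ?, ?)",
        (started_at.isoformat(), finished_at.isoformat(), status, rows_loaded, message),
    )
    conn.commit()


# In-memory database so the sketch leaves no file behind
conn = sqlite3.connect(":memory:")
init_log_table(conn)

started = datetime.now(timezone.utc)
# ... the ETL step would run here ...
log_run(conn, started, datetime.now(timezone.utc), "success", rows_loaded=42)

rows = conn.execute("SELECT status, rows_loaded FROM pipeline_run_log").fetchall()
print(rows)
```

Calling `log_run` at the end of the scheduled job, with a `"failed"` status and the error message on exceptions, would give one row per run to check against the schedule.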