Open AetherUnbound opened 1 year ago
Unfortunately, the new 16 hour timeout appeared not to be enough during a recent run 😞 (although it got through many more tasks!)
@stacimc I'm noticing a bit of a discrepancy: most of the reingestion workflows run weekly, but we have them time out at 23 hours.
Europeana appears to be consistently hitting that 23 hour limit. Do you think it makes sense to extend that, given that it's weekly? Or are there other reasons why we'd want the reingestion to always take less than a day?
The reingestion workflows were originally intended to be run daily, which is why they had the 23 hour limit. When I first enabled them, I wanted to start them out on a weekly schedule and then adjust them as necessary.
Flickr we’ll definitely still want to upgrade to daily once it’s up and running (I have an issue tracking that), but for Europeana and the others I’m not as opinionated. Since Europeana is taking so long I think extending the timeout and keeping it weekly seems like a good idea 👍
Description
We have a number of Europeana reingestion days which are timing out at 12 hours. We should potentially increase the execution timeout of these runs to 16 hours.
https://github.com/WordPress/openverse-catalog/blob/92771e3bd6150c3dd634c34c235ddffc18260192/openverse_catalog/dags/providers/provider_reingestion_workflows.py#L64
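For illustration, here is a rough sketch of what the proposed change amounts to. It is not the code in the linked file: `ReingestionConfig`, `pull_timeout`, `SCHEDULE_INTERVALS`, and `fits_schedule` are hypothetical names made up for this example, and it uses only the standard library rather than Airflow. In a real DAG the timeout would presumably be handed to the pull task as a `timedelta` (for example via Airflow's `execution_timeout` task argument).

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical lookup of schedule presets to their intervals (assumption,
# not taken from the catalog code).
SCHEDULE_INTERVALS = {
    "@daily": timedelta(days=1),
    "@weekly": timedelta(weeks=1),
}


@dataclass(frozen=True)
class ReingestionConfig:
    """Hypothetical stand-in for a reingestion workflow's settings."""

    dag_id: str
    schedule: str = "@weekly"
    # Current limit described in this issue: runs time out at 12 hours.
    pull_timeout: timedelta = timedelta(hours=12)

    def fits_schedule(self) -> bool:
        # A run should be able to finish (or time out) before the next one starts.
        return self.pull_timeout < SCHEDULE_INTERVALS[self.schedule]


# Proposed change: raise the Europeana timeout from 12 to 16 hours.
europeana = ReingestionConfig(
    dag_id="europeana_reingestion_workflow",
    pull_timeout=timedelta(hours=16),
)

if __name__ == "__main__":
    print(europeana.dag_id, europeana.pull_timeout, europeana.fits_schedule())
```

The `fits_schedule` check only captures the constraint discussed in the comments: a weekly schedule leaves room for a timeout well beyond one day, while a daily schedule does not.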
Additional context
List of failing tasks:
https://airflow.openverse.engineering/taskinstance/list/?_flt_3_dag_id=europeana_reingestion_workflow&_flt_3_state=failed&_flt_2_end_date=2023-01-09+19%3A22%3A43%2B00%3A00