WordPress / openverse

Openverse is a search engine for openly-licensed media. This monorepo includes all application code.
https://openverse.org
MIT License

Increase Europeana reingestion timeout #1300

Open AetherUnbound opened 1 year ago

AetherUnbound commented 1 year ago

Description

A number of Europeana reingestion days are timing out at the current 12-hour limit. We should consider increasing the execution timeout for these runs to 16 hours.

https://github.com/WordPress/openverse-catalog/blob/92771e3bd6150c3dd634c34c235ddffc18260192/openverse_catalog/dags/providers/provider_reingestion_workflows.py#L64
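As a rough illustration of the proposed change, the sketch below builds task keyword arguments with a 16-hour timeout in place of 12 hours. It assumes the timeout is passed as a `datetime.timedelta` through Airflow's `execution_timeout` operator argument; the constant names and the `reingestion_task_kwargs` helper are hypothetical and are not taken from `provider_reingestion_workflows.py`.

```python
from datetime import timedelta

# Hypothetical constants for illustration; the real values live in
# provider_reingestion_workflows.py and may be structured differently.
CURRENT_TIMEOUT = timedelta(hours=12)
PROPOSED_TIMEOUT = timedelta(hours=16)


def reingestion_task_kwargs(timeout: timedelta) -> dict:
    """Build keyword arguments for a reingestion task (hypothetical helper).

    Airflow operators accept `execution_timeout` as a timedelta; a task
    that runs longer than this is failed by the scheduler.
    """
    return {"execution_timeout": timeout}


kwargs = reingestion_task_kwargs(PROPOSED_TIMEOUT)
extra = PROPOSED_TIMEOUT - CURRENT_TIMEOUT
print(f"Each reingestion day gets {extra} of additional run time")
```

Whether 16 hours is sufficient can only be confirmed by observing subsequent runs.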

Additional context

List of failing tasks:

https://airflow.openverse.engineering/taskinstance/list/?_flt_3_dag_id=europeana_reingestion_workflow&_flt_3_state=failed&_flt_2_end_date=2023-01-09+19%3A22%3A43%2B00%3A00

AetherUnbound commented 1 year ago

Unfortunately, the new 16-hour timeout still wasn't enough during a recent run 😞 (although it got through many more tasks!)

@stacimc I'm noticing a bit of a discrepancy. Most of the reingestion workflows run weekly, yet we have them time out at 23 hours:

https://github.com/WordPress/openverse-catalog/blob/af205795b412da9f3d0d7f6d7e066b7d039b1763/openverse_catalog/dags/providers/provider_reingestion_workflows.py#L44-L47

Europeana appears to be consistently hitting that 23 hour limit. Do you think it makes sense to extend that, given that it's weekly? Or are there other reasons why we'd want the reingestion to always take less than a day?

stacimc commented 1 year ago

The reingestion workflows were originally intended to be run daily, which is why they had the 23 hour limit. When I first enabled them, I wanted to start them out on a weekly schedule and then adjust them as necessary.

We’ll definitely still want to move Flickr to a daily schedule once it’s up and running (I have an issue tracking that), but for Europeana and the others I’m not as opinionated. Since Europeana is taking so long, I think extending the timeout and keeping it weekly is a good idea 👍
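The relationship between schedule and timeout discussed above can be sketched as a simple check: a run should finish, with some headroom, before the next scheduled run begins. This is only an illustration; the `timeout_fits_schedule` helper and the one-hour margin are assumptions and are not part of the catalog code.

```python
from datetime import timedelta

DAILY = timedelta(days=1)
WEEKLY = timedelta(weeks=1)


def timeout_fits_schedule(
    timeout: timedelta,
    schedule_interval: timedelta,
    margin: timedelta = timedelta(hours=1),  # assumed headroom, not from the source
) -> bool:
    """Return True if a run that uses its full timeout still ends
    at least `margin` before the next scheduled run would start."""
    return timeout + margin <= schedule_interval


# A 23-hour timeout is the most a daily schedule allows with a 1-hour margin.
print(timeout_fits_schedule(timedelta(hours=23), DAILY))   # True
# On a daily schedule, anything longer would overlap the next run.
print(timeout_fits_schedule(timedelta(days=2), DAILY))     # False
# On a weekly schedule there is room for a much longer timeout.
print(timeout_fits_schedule(timedelta(days=2), WEEKLY))    # True
```

Under these assumptions, the 23-hour limit only matters for daily workflows, so a weekly Europeana run could be given a longer timeout without overlapping the next run.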