WordPress / openverse

Openverse is a search engine for openly-licensed media. This monorepo includes all application code.
https://openverse.org
MIT License

Add `backoff` to Stocksnap DAG for 5XX errors #4878

Open AetherUnbound opened 3 weeks ago

AetherUnbound commented 3 weeks ago

Airflow log link

Note: Airflow is currently only accessible to maintainers & those given access. If you would like access to Airflow, please reach out to a member of @WordPress/openverse-maintainers.

https://airflow.openverse.org/dags/stocksnap_workflow/grid?dag_run_id=manual__2024-09-05T18%3A39%3A42%2B00%3A00&task_id=ingest_data.pull_image_data&base_date=2024-09-05T18%3A39%3A42%2B0000&tab=logs

Description

It looks like we're starting to encounter ephemeral 5XX errors with certain Stocksnap requests (separate from #4101). We should add a `backoff.on_exception` wrapper to the `_get_filesize` function of the ingestion class so these errors can be retried:

https://github.com/WordPress/openverse/blob/7f4fb7c91363747505d72a63dc7509617d16acfc/catalog/dags/providers/provider_api_scripts/stocksnap.py#L160

This can be done using a decorator like the one used for Freesound:

https://github.com/WordPress/openverse/blob/05f8c55df108e9af237bde8d4b8e2834fe71f89f/catalog/dags/providers/provider_api_scripts/freesound.py#L182

However, we'd want the retry check to look like the global one used for Science Museum (see #4715):

https://github.com/WordPress/openverse/blob/81d8da840a732073af261c30924e524459d97370/catalog/dags/providers/provider_api_scripts/science_museum.py#L50-L59
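
For illustration, here is a rough, standard-library-only sketch of the behaviour described above: retry `_get_filesize` on 5XX responses and give up immediately on anything else. It is not the Openverse code. `retry_on_exception`, `is_not_server_error`, and `FakeStocksnapIngester` are hypothetical names, the hand-rolled decorator is only a stand-in that mimics the shape of `backoff.on_exception`, and `urllib.error.HTTPError` stands in for whatever exception the real ingester raises. The assumption that only 5XX codes should be retried is mine as well.

```python
import functools
import time
from urllib.error import HTTPError


def is_not_server_error(exc: HTTPError) -> bool:
    """Giveup predicate (hypothetical): stop retrying unless the status is 5XX."""
    return not (500 <= exc.code < 600)


def retry_on_exception(exception, max_tries=3, giveup=lambda exc: False,
                       base_delay=1.0, sleep=time.sleep):
    """Hand-rolled stand-in for a backoff-style retry decorator."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_tries + 1):
                try:
                    return func(*args, **kwargs)
                except exception as exc:
                    # Re-raise for non-retryable errors or once tries run out.
                    if giveup(exc) or attempt == max_tries:
                        raise
                    # Exponential wait: base_delay, 2x, 4x, ...
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


class FakeStocksnapIngester:
    """Simulated ingester that replays a fixed list of HTTP statuses."""

    def __init__(self, statuses):
        self.statuses = list(statuses)
        self.calls = 0

    # sleep is stubbed out so the example runs instantly.
    @retry_on_exception(HTTPError, max_tries=3, giveup=is_not_server_error,
                        sleep=lambda seconds: None)
    def _get_filesize(self, url):
        self.calls += 1
        status = self.statuses.pop(0)
        if status != 200:
            raise HTTPError(url, status, "simulated error", None, None)
        return 12345  # made-up file size in bytes


if __name__ == "__main__":
    ingester = FakeStocksnapIngester([502, 502, 200])
    print(ingester._get_filesize("https://example.com/image.jpg"), ingester.calls)
```

With the `backoff` library itself, the decorator would presumably take the form `@backoff.on_exception(backoff.expo, <exception type>, max_tries=..., giveup=...)`, with the giveup callable playing the role of `is_not_server_error` here.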

Reproduction

Since these are transient issues, I wasn't able to reproduce the 502 seen in the logs.

DAG status

I've left this DAG enabled, though I will add a silenced alert clause that links to this issue.

MarleaM commented 1 day ago

@AetherUnbound I'd love to work on this issue!

AetherUnbound commented 1 day ago

Hi @MarleaM, thank you for your interest in contributing to Openverse! I've assigned this issue to you. If you have any questions, you may leave them here.

Please check out our welcome and general setup documentation pages for getting started with setting up your local environment.