Open AetherUnbound opened 3 weeks ago
@AetherUnbound I'd love to work on this issue!
Hi @MarleaM, thank you for your interest in contributing to Openverse! I've assigned this issue to you. If you have any questions, you may leave them here.
Please check out our welcome and general setup documentation pages for getting started with setting up your local environment.
Airflow log link
https://airflow.openverse.org/dags/stocksnap_workflow/grid?dag_run_id=manual__2024-09-05T18%3A39%3A42%2B00%3A00&task_id=ingest_data.pull_image_data&base_date=2024-09-05T18%3A39%3A42%2B0000&tab=logs
Description
It looks like we're starting to encounter ephemeral 5XX errors with certain Stocksnap requests (separate from #4101). We should add a backoff.on_exception wrapper to the _get_filesize function of the ingestion class so these errors can be retried:
https://github.com/WordPress/openverse/blob/7f4fb7c91363747505d72a63dc7509617d16acfc/catalog/dags/providers/provider_api_scripts/stocksnap.py#L160
This can be done with a decorator, as in the Freesound script:
https://github.com/WordPress/openverse/blob/05f8c55df108e9af237bde8d4b8e2834fe71f89f/catalog/dags/providers/provider_api_scripts/freesound.py#L182
However, we'd want the retry check to look like the global one used for Science Museum (see #4715):
https://github.com/WordPress/openverse/blob/81d8da840a732073af261c30924e524459d97370/catalog/dags/providers/provider_api_scripts/science_museum.py#L50-L59
Reproduction
Since these are transient issues, I wasn't able to reproduce the 502 seen in the logs.
DAG status
I've left this enabled, though I will add a silenced alert clause for it that links to this issue.