Open AetherUnbound opened 7 months ago
I've opened #4102 to help us reproduce this. Once that's merged, we should run the DAG again and see if it fails in the same place. If it does, we can continue to troubleshoot. If it doesn't, we can close this and reopen if it comes up again.
Confirmed (now that initial_query_params works!) that this fails locally when starting with the params {"page": 780}. By the time I tested locally, the error was actually occurring on page 781, possibly because more records had been added in the meantime.
I have emailed Stocksnap about this issue.
Airflow log link
https://airflow.openverse.org/log?execution_date=2024-03-01T00%3A00%3A00%2B00%3A00&task_id=ingest_data.pull_image_data&dag_id=stocksnap_workflow&map_index=-1
Description
The Stocksnap DAG encountered an error during ingestion:
On top of this, Stocksnap uses a page counter instead of normal query params, so it's difficult to determine which page it failed on:
https://github.com/WordPress/openverse/blob/852e6b7d852728a7213f860d0a8657f06c584e00/catalog/dags/providers/provider_api_scripts/stocksnap.py#L45
In addition to resolving this issue, we should try to alter the DAGs that don't normally use query parameters so they still have something to report when they fail.
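As a rough illustration of that idea, here is a minimal, self-contained sketch of a page-counter ingester that attaches the current page to any failure and can resume from a given page. It does not use the actual Openverse ingester classes: `PageCounterIngester`, `PageFetchError`, `fetch_page`, and `fake_fetch` are all hypothetical names invented for this example, and the only thing borrowed from this issue is the `{"page": 780}` shape of the initial params.

```python
class PageFetchError(Exception):
    """Raised when fetching a page fails; carries the page number for reporting."""

    def __init__(self, page, cause):
        super().__init__(f"Ingestion failed on page {page}: {cause!r}")
        self.page = page


class PageCounterIngester:
    """Hypothetical stand-in for a provider script that pages via a counter."""

    def __init__(self, fetch_page, initial_query_params=None):
        # fetch_page is an assumed callable: page number -> list of records.
        self.fetch_page = fetch_page
        # Allow resuming from a specific page, e.g. {"page": 780}.
        params = initial_query_params or {}
        self.page = params.get("page", 1)

    def ingest(self):
        records = []
        while True:
            try:
                batch = self.fetch_page(self.page)
            except Exception as err:
                # Surface the counter so the failure report names the page.
                raise PageFetchError(self.page, err) from err
            if not batch:
                # An empty batch is treated as the end of the data.
                return records
            records.extend(batch)
            self.page += 1


def fake_fetch(page):
    """Fake provider API for demonstration: page 781 is broken."""
    if page == 781:
        raise ValueError("malformed response")
    return [f"record-{page}"] if page < 785 else []
```

Starting this with `PageCounterIngester(fake_fetch, initial_query_params={"page": 780}).ingest()` raises a `PageFetchError` whose `page` attribute is 781, which is the kind of information the failure report is currently missing.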
DAG status
Unchanged for now since this is a monthly DAG