sul-dlss / libsys-airflow

Airflow DAGS for migrating and managing ILS data into FOLIO along with other LibSys workflows
Apache License 2.0

Trigger polling dag for re-running failed digital_bookplate_979 dag runs #1446

Open shelleydoljack opened 2 days ago

shelleydoljack commented 2 days ago

When we re-run failed digital_bookplate_979 dag runs (by clearing their tasks in the Airflow UI), we need a way to trigger the poll_for_digital_bookplate_979s_email dag with the dag run IDs so that an email is sent. The poll_for_digital_bookplate_979s_email dag is triggered with a configuration like:

{
    "dag_runs": [
        "manual__2024-10-30T20:38:05.622419+00:00",
        "manual__2024-11-07T19:05:01.569291+00:00",
        "manual__2024-11-07T19:05:01.667437+00:00"
    ],
    "email": null
}

Here is a sample request for querying for failed digital_bookplate_979 dag runs:

curl -X 'GET' \
  'https://sul-libsys-airflow-dev.stanford.edu/api/v1/dags/digital_bookplate_979/dagRuns?limit=100&state=failed' \
  -H 'accept: application/json'

It returns data like this:

{
  "dag_runs": [
    {
      "conf": {
        "druids_for_instance_id": {
          "3d67ccc4-5173-47b6-a6cc-011f24c93a65": [
            {
              "druid": "nh023fz6328",
              "fund_name": "HAMMONDA",
              "image_filename": "nh023fz6328_00_0001.jp2",
              "title": "Andrew B. Hammond Memorial Book Fund"
            }
          ]
        }
      },
      "dag_id": "digital_bookplate_979",
      "dag_run_id": "manual__2024-10-30T20:38:05.622419+00:00",
      "data_interval_end": "2024-10-30T20:38:05.622419+00:00",
      "data_interval_start": "2024-10-30T20:38:05.622419+00:00",
      "end_date": "2024-11-18T21:44:32.671466+00:00",
      "execution_date": "2024-10-30T20:38:05.622419+00:00",
      "external_trigger": true,
      "last_scheduling_decision": "2024-11-18T21:44:32.669285+00:00",
      "logical_date": "2024-10-30T20:38:05.622419+00:00",
      "note": null,
      "run_type": "manual",
      "start_date": "2024-11-18T21:43:19.603407+00:00",
      "state": "failed"
    },
    ...
  ],
  "total_entries": 14
}
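
As a rough illustration of how that response maps onto the polling dag's configuration, here is a minimal sketch in plain Python. The function name `build_polling_conf` is hypothetical and not part of libsys-airflow, and the sample response is trimmed to the fields used here.

```python
import json
from typing import Optional


def build_polling_conf(api_response: dict, email: Optional[str] = None) -> dict:
    """Collect the IDs of failed runs into a conf dict for the polling dag.

    Hypothetical helper: the name and signature are assumptions.
    """
    run_ids = [
        run["dag_run_id"]
        for run in api_response.get("dag_runs", [])
        # The query already filters on state=failed; this check is a safeguard.
        if run.get("state") == "failed"
    ]
    return {"dag_runs": run_ids, "email": email}


# Trimmed copy of the sample response above (only the fields we read).
sample_response = {
    "dag_runs": [
        {
            "dag_id": "digital_bookplate_979",
            "dag_run_id": "manual__2024-10-30T20:38:05.622419+00:00",
            "state": "failed",
        }
    ],
    "total_entries": 1,
}

print(json.dumps(build_polling_conf(sample_response), indent=4))
```

The resulting dict has the same shape as the configuration shown at the top of this issue, with `email` left as null unless one is passed in.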

In the normal workflow, the polling dag is triggered from the digital_bookplate_instances dag and from the upload form.

shelleydoljack commented 19 hours ago

Create a new re-run dag for digital_bookplate_979 dag runs in the failed state. It should be scheduled monthly and trigger the polling dag with the list of dag run IDs.
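
One possible shape for the core of that re-run dag, sketched outside of Airflow so it can be run on its own. It assumes the failed runs are read through the REST API's `limit` and `offset` paging (the sample request above uses `limit=100`, and the response reports `total_entries`) and that the polling dag is triggered with a `conf` body. The names `collect_failed_run_ids`, `polling_trigger_body`, and the `fetch_page` callable are all hypothetical; the real dag might query Airflow's own models instead.

```python
from typing import Callable, Dict, List, Optional


def collect_failed_run_ids(
    fetch_page: Callable[[int, int], Dict], page_size: int = 100
) -> List[str]:
    """Page through failed dag runs and return every dag_run_id.

    fetch_page(limit, offset) is an assumed callable that returns one page
    of the dagRuns response shown earlier (dag_runs plus total_entries).
    """
    run_ids: List[str] = []
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        runs = page.get("dag_runs", [])
        run_ids.extend(run["dag_run_id"] for run in runs)
        offset += len(runs)
        # Stop on an empty page or once every reported entry has been seen.
        if not runs or offset >= page.get("total_entries", 0):
            break
    return run_ids


def polling_trigger_body(run_ids: List[str], email: Optional[str] = None) -> Dict:
    """Build a request body for triggering the polling dag (assumed shape)."""
    return {"conf": {"dag_runs": run_ids, "email": email}}
```

In a monthly scheduled dag, `fetch_page` would wrap the GET request shown earlier, and the body from `polling_trigger_body` would be used to trigger poll_for_digital_bookplate_979s_email.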