sul-dlss / libsys-airflow

Airflow DAGS for migrating and managing ILS data into FOLIO along with other LibSys workflows
Apache License 2.0

Instance with duplicated 979s sent to digital_bookplate_979 DAG #1390

Closed shelleydoljack closed 1 day ago

shelleydoljack commented 1 week ago

For the DAG run ID manual__2024-10-30T20:53:15.986083+00:00, the [XCOM](https://sul-libsys-airflow-dev.stanford.edu/dags/digital_bookplate_979/grid?run_id=manual__2024-10-30T20%3A53%3A15.986083%2B00%3A00&execution_date=2024-10-30+20%3A53%3A15.986083%2B00%3A00&dag_run_id=manual2024-10-30T20%3A53%3A15.986083%2B00%3A00&task_id=retrieve_druids_for_instance_task&tab=xcom) for retrieve_druids_for_instance_task shows that the instance has the same fund 5 times:

{
  "66d4d0a9-f4bc-58de-a4b9-02b0f439ee65": [
    {
      "fund_name": "LINDER",
      "druid": "xg474qk4925",
      "image_filename": "xg474qk4925_00_0001.jp2",
      "title": "The Doris H. Linder Book Fund"
    },
    {
      "fund_name": "LINDER",
      "druid": "xg474qk4925",
      "image_filename": "xg474qk4925_00_0001.jp2",
      "title": "The Doris H. Linder Book Fund"
    },
    {
      "fund_name": "LINDER",
      "druid": "xg474qk4925",
      "image_filename": "xg474qk4925_00_0001.jp2",
      "title": "The Doris H. Linder Book Fund"
    },
    {
      "fund_name": "LINDER",
      "druid": "xg474qk4925",
      "image_filename": "xg474qk4925_00_0001.jp2",
      "title": "The Doris H. Linder Book Fund"
    },
    {
      "fund_name": "LINDER",
      "druid": "xg474qk4925",
      "image_filename": "xg474qk4925_00_0001.jp2",
      "title": "The Doris H. Linder Book Fund"
    }
  ]
}

The result is that the record in folio-test https://folio-test.stanford.edu/inventory/viewsource/66d4d0a9-f4bc-58de-a4b9-02b0f439ee65 ended up with duplicate 979s. I'm not sure if this is because we had to retry this task several times, but that shouldn't matter: the subsequent DAG runs that add 979s should not have added duplicates.
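One way to make the 979 step safe to retry is to collapse identical bookplate entries and skip any that the record already carries. The following is only a minimal sketch under assumptions: the function names `dedupe_bookplates` and `bookplates_to_add` are hypothetical, and it compares plain dicts shaped like the XCOM payload above rather than real MARC 979 fields.

```python
# Hypothetical helpers, not taken from libsys-airflow.
# Assumption: each bookplate is a dict shaped like the XCOM entries above.
BOOKPLATE_KEYS = ("fund_name", "druid", "image_filename", "title")


def _identity(bookplate):
    """Build a hashable identity from the fields that define a bookplate."""
    return tuple(bookplate.get(key) for key in BOOKPLATE_KEYS)


def dedupe_bookplates(bookplates):
    """Drop exact repeats while keeping first-seen order."""
    seen = set()
    unique = []
    for bookplate in bookplates:
        identity = _identity(bookplate)
        if identity not in seen:
            seen.add(identity)
            unique.append(bookplate)
    return unique


def bookplates_to_add(existing, incoming):
    """Return incoming bookplates that are not already on the record."""
    already_present = {_identity(bookplate) for bookplate in existing}
    return [
        bookplate
        for bookplate in dedupe_bookplates(incoming)
        if _identity(bookplate) not in already_present
    ]
```

With a check like this, the five identical LINDER entries would collapse to one, and a retried run would find nothing new to add once that bookplate is already on the record.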

shelleydoljack commented 1 week ago

There are 2 po lines with this instance ID (/orders-storage/po-lines?query=instanceId==66d4d0a9-f4bc-58de-a4b9-02b0f439ee65): po line numbers HOGARTHBLANKET-1 and HOGARTHNRBLANKET-1. These package orders are suppressed from discovery, so it probably isn't a big deal that the MARC got duplicated 979s in folio-test. But we could probably be better about not adding the same po line in the bookplate_fund_polines task.

shelleydoljack commented 1 week ago

I found one in the LINDER fund where the instance is not suppressed. In searchworks-stage the two 979s are de-duped: https://searchworks-stage.stanford.edu/view/in00000115693

shelleydoljack commented 1 week ago

I reviewed all the emails to add 979s and couldn't find more examples where acquisitions has multiple invoice lines with the same fund linked to the same po line, and therefore the same instance ID. Maybe the LINDER fund is an exception? In any case, we should probably still de-dup in the bookplate_fund_polines task so that we aren't hitting orders-storage/po-lines for the same po line ID unnecessarily.
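The de-dup suggested here could look roughly like the sketch below. It is an illustration under assumptions, not the actual bookplate_fund_polines code: the `poLineId` key on each invoice line and the `fetch_po_line` callable (standing in for whatever issues the orders-storage/po-lines request) are both assumed for the example.

```python
# Hypothetical sketch, not the real bookplate_fund_polines task.
# Assumptions: invoice lines are dicts with a "poLineId" key, and
# fetch_po_line is any callable that returns the po line for an ID.


def unique_po_line_ids(invoice_lines):
    """Collect po line IDs once each, keeping first-seen order."""
    ids = (line.get("poLineId") for line in invoice_lines)
    # dict.fromkeys drops repeats and preserves insertion order.
    return list(dict.fromkeys(po_id for po_id in ids if po_id))


def fetch_unique_po_lines(invoice_lines, fetch_po_line):
    """Look up each distinct po line exactly once."""
    return {
        po_line_id: fetch_po_line(po_line_id)
        for po_line_id in unique_po_line_ids(invoice_lines)
    }
```

Passing the lookup in as a callable keeps the sketch independent of any particular FOLIO client, and several invoice lines that point at the same po line result in a single request.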