sul-dlss / libsys-airflow

Airflow DAGS for migrating and managing ILS data into FOLIO along with other LibSys workflows

Check for tag uniqueness not working right #1375

Closed shelleydoljack closed 3 weeks ago

shelleydoljack commented 3 weeks ago

There is still something not quite right about checking the tag uniqueness before adding.

After #1373 got merged, I triggered the digital_bookplate_979 DAG with the following config:

{
  "druids_for_instance_id": {
    "b6e80385-3be3-4e84-a530-d89d072af46b": [
      {
        "fund_name": "STEINMETZ",
        "druid": "nc092rd1979",
        "image_filename": "nc092rd1979_00_0001.jp2",
        "title": "Verna Pace Steinmetz Endowed Book Fund in History"
      },
      {
        "fund_name": "WHITEHEAD",
        "druid": "ph944pq1002",
        "image_filename": "ph944pq1002_00_0001.jp2",
        "title": "Barry Whitehead Memorial Book Fund"
      }
    ]
  }
}

https://sul-libsys-airflow-stage.stanford.edu/dags/digital_bookplate_979/grid?run_id=manual__2024-10-29T16%3A17%3A21%2B00%3A00&execution_date=2024-10-29+16%3A17%3A21%2B00%3A00&tab=graph&dag_run_id=manual__2024-10-29T16%3A17%3A21%2B00%3A00

The resulting record now has two 979 fields for WHITEHEAD (one of which was added previously, when we ran all the DAGs on 10/28/2024).

shelleydoljack commented 3 weeks ago

The problem here https://github.com/sul-dlss/libsys-airflow/blob/2be91f76510495e8cb0bb747004bdc817b9a07b2/libsys_airflow/plugins/shared/utils.py#L141-L151 is that fields could be a list of the following pymarc.Field objects:

979     ‡f STEINMETZ ‡b druid:nc092rd1979 ‡c nc092rd1979_00_0001.jp2 ‡d Verna Pace Steinmetz Endowed Book Fund in History
979     ‡f WHITEHEAD ‡b druid:ph944pq1002 ‡c ph944pq1002_00_0001.jp2 ‡d Barry Whitehead Memorial Book Fund 
979     ‡f WHITEHEAD ‡b druid:ph944pq1002 ‡c ph944pq1002_00_0001.jp2 ‡d Barry Whitehead Memorial Book Fund

Meanwhile, new_field could be the STEINMETZ 979 or the WHITEHEAD 979, depending on when the function is called in the loop here https://github.com/sul-dlss/libsys-airflow/blob/2be91f76510495e8cb0bb747004bdc817b9a07b2/libsys_airflow/plugins/shared/utils.py#L109

On the first pass over the existing fields, comparing the first existing field to the new STEINMETZ field, we get a "skip adding", but on the second pass we compare the existing WHITEHEAD field to the new STEINMETZ field and say "tag is unique".
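
To illustrate the failure mode described above, here is a minimal sketch in plain Python. It does not reproduce the linked code in utils.py: `Field979`, `is_unique_buggy`, and `is_unique` are hypothetical names, and a named tuple stands in for pymarc.Field so the example runs without pymarc. The buggy version makes a decision on every comparison, so a later non-matching field overwrites an earlier match. The alternative only reports "unique" when no existing field matches at all.

```python
from typing import List, NamedTuple


class Field979(NamedTuple):
    """Hypothetical stand-in for a pymarc.Field with tag 979."""
    fund_name: str       # subfield f
    druid: str           # subfield b
    image_filename: str  # subfield c
    title: str           # subfield d


def is_unique_buggy(existing: List[Field979], new_field: Field979) -> bool:
    """Assumed shape of the bug: each comparison overwrites the verdict."""
    unique = True
    for field in existing:
        if field == new_field:
            unique = False  # "skip adding"
        else:
            unique = True   # "tag is unique", even if an earlier field matched
    return unique


def is_unique(existing: List[Field979], new_field: Field979) -> bool:
    """Unique only if the new field matches none of the existing fields."""
    return all(field != new_field for field in existing)


steinmetz = Field979(
    "STEINMETZ",
    "druid:nc092rd1979",
    "nc092rd1979_00_0001.jp2",
    "Verna Pace Steinmetz Endowed Book Fund in History",
)
whitehead = Field979(
    "WHITEHEAD",
    "druid:ph944pq1002",
    "ph944pq1002_00_0001.jp2",
    "Barry Whitehead Memorial Book Fund",
)

existing_fields = [steinmetz, whitehead]

# The new STEINMETZ field already exists, so it should not be unique.
print(is_unique_buggy(existing_fields, steinmetz))  # True (wrong)
print(is_unique(existing_fields, steinmetz))        # False (expected)
```

With pymarc.Field objects the equality test would need to compare tag and subfield values explicitly (for example via their string form); that detail is left out here because it depends on how the real function builds and compares fields.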