sul-dlss / libsys-airflow

Airflow DAGS for migrating and managing ILS data into FOLIO along with other LibSys workflows
Apache License 2.0

Create a function to add a MARC tag to data in FOLIO #1185

Closed shelleydoljack closed 3 days ago

shelleydoljack commented 1 month ago

This function should be written so that it can be used by different DAGs that have a task to "add some data to a MARC record" in FOLIO. The input will be an instance UUID and the data to be added, probably a list of fields such as `[{'979': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'f': 'ABBOTT'}, {'b': 'druid:ws066yy0421'}, {'c': 'ws066yy0421_00_0001.jp2'}, {'d': 'The The Donald P. Abbott Fund for Marine Invertebrates'}]}}]`. We will use this function to add 979 bookplate data to a MARC record. We can look at the OCLCWrapperAPI class for an idea of the endpoints to use (and maybe refactor OCLCWrapperAPI to use the new function). There is also some information in the instances paid on bookplate funds notebook (see "Get the MARC for each instance to do a PUT to add 979"). We can query change-manager/parsedRecords to get the MARC that needs the tag data added, and we also need to query inventory/instances to get the _version field for the PUT back to change-manager/parsedRecords. The function should account for overwriting an existing field in the record (if the incoming data is the same) or adding the field to the MARC record. If there are errors in updating the record, these should be caught in an exception. If the error is due to optimistic locking, we should capture that for a retry after getting an updated _version from inventory.
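
A minimal sketch of the overwrite-or-add part of the requested function, assuming SRS-style MARC JSON where `record["fields"]` is a list of single-key dicts like the 979 example above. The function name `upsert_marc_fields` and its shape are hypothetical, not an existing helper in libsys-airflow:

```python
def upsert_marc_fields(record: dict, incoming_fields: list) -> dict:
    """Add each incoming field to a MARC JSON record.

    If an identical field (same tag, same indicators and subfields) already
    exists, "overwriting" it with the same data is a no-op, so it is skipped;
    otherwise the field is appended to the record.
    """
    for new_field in incoming_fields:
        if new_field in record["fields"]:
            continue  # incoming data is the same; leave the record alone
        record["fields"].append(new_field)
    return record
```

Calling it twice with the same 979 bookplate field should leave only one copy in the record. The real function would wrap this between the GET from change-manager/parsedRecords and the PUT back with the current _version.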

jgreben commented 1 week ago

How do we know if the error is due to optimistic locking?

shelleydoljack commented 1 week ago

> How do we know if the error is due to optimistic locking?

I'm not entirely sure of the HTTP status or response body, but you can experiment by doing a PUT that contains a _version value different from what is in inventory. It looks like maybe "Failed to update parsed record with id" will be in the message.
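
Since the exact status code and message are unconfirmed, the detection could be isolated in one small predicate so it is easy to adjust after experimenting. Both the 409 status and the message fragment below are assumptions from this thread, not verified FOLIO behavior:

```python
def is_optimistic_lock_error(status_code: int, body: str) -> bool:
    """Guess whether a failed PUT to change-manager/parsedRecords was caused
    by a stale _version (optimistic locking).

    ASSUMPTIONS: FOLIO may return 409 Conflict for version conflicts, and the
    error message may contain "Failed to update parsed record with id".
    Neither is confirmed; adjust once the real response is observed.
    """
    return status_code == 409 or "Failed to update parsed record with id" in body
```

The retry path would then re-fetch _version from inventory/instances only when this returns True, and treat other errors as hard failures.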

shelleydoljack commented 1 week ago

@jgreben this is what we talked about in standup this morning regarding mapped tasks for this function, so that it retries adding the 979 to one record instead of retrying the whole list: https://github.com/sul-dlss/libsys-airflow/issues/1186#issuecomment-2394173810

I think here is an example in the DAG that groups the tasks: https://github.com/sul-dlss/libsys-airflow/blob/9638fe19c5791834f37f3d9b20d62690bbccc26d/libsys_airflow/dags/digital_bookplates/fetch_digital_bookplates.py#L47-L49

And this is the task that processes each task instance: https://github.com/sul-dlss/libsys-airflow/blob/9638fe19c5791834f37f3d9b20d62690bbccc26d/libsys_airflow/plugins/digital_bookplates/purl_fetcher.py#L111-L140. I think maybe Airflow "magic" turns each object of a list into a task instance.

shelleydoljack commented 1 week ago

Actually, after re-reading that comment, he is saying to make it a separate DAG run. So maybe it's not quite the same thing as what we were saying about mapping each list object to a task instance.

shelleydoljack commented 1 week ago

When a MARC record fails to update after 3? retries, we want to add the instance ID and 979 data to a "failures" list to be consumed by a downstream task that emails when the DAG is done running. ~~We don't want the whole DAG to fail and downstream tasks to fail.~~
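
The retry-then-collect behavior could look something like the sketch below. The function name, the `(uuid, fields)` input shape, and `do_put` (a stand-in for the real FOLIO PUT call) are all hypothetical; the point is that a record exhausting its retries lands in the failures list instead of raising and failing downstream tasks:

```python
def add_tags_with_retries(instances, do_put, max_retries: int = 3) -> list:
    """Attempt to PUT 979 data for each (instance_uuid, fields) pair.

    A record that still fails after max_retries attempts is appended to the
    failures list for a downstream email task, rather than raising and
    failing the whole DAG run.
    """
    failures = []
    for uuid, fields in instances:
        for attempt in range(max_retries):
            try:
                do_put(uuid, fields)  # stand-in for the real FOLIO update
                break
            except Exception:
                if attempt == max_retries - 1:
                    failures.append({"instance_id": uuid, "fields": fields})
    return failures
```

In the DAG itself this logic would more likely live across mapped task instances (with Airflow handling per-record retries) than in one loop, but the failure-collection contract for the email task is the same.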