sul-dlss / dlme-airflow

This is a new repository to capture the work related to the DLME ETL Pipeline and establish airflow
Apache License 2.0

add ucla collections and related post harvest task, fixes #427 #431

Closed jacobthill closed 1 year ago

jacobthill commented 1 year ago

I considered several options for implementing this, particularly reusing the existing drivers for the second data source. I realized, however, that the only case we have for merging two data sources is when one of them is IIIF v3. It would be great to merge IIIF v2 sources with their OAI-PMH counterparts (we have some cases for that), but the IIIF v2 sources lack the thumbnail key, which is needed to merge the two data frames. We've never had another case for merging two data sources. If we used the iiif_json driver, we would need to make the catalog considerably more complex (we would double all of the fields, paths, etc.), modify the driver to look for different data settings, and modify the tests. Given that, it seemed better to contain this in a single post harvest task for now. It should work on all IIIF v3 collections.
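
For illustration only, here is a rough sketch of the kind of merge described above. It is not the code in this PR: the function name `merge_on_thumbnail`, the field names other than `thumbnail`, and the example URLs are all hypothetical, and it uses plain lists of dicts in place of data frames to stay dependency-free. It shows why a shared thumbnail value is needed to line up the two sources, and why records without that key (as with IIIF v2) simply do not match.

```python
def merge_on_thumbnail(harvested, iiif_records, key="thumbnail"):
    """Left-join IIIF-derived fields onto harvested records sharing a key value.

    Hypothetical helper: the real post harvest task may work differently.
    """
    # Index the IIIF records by key; records lacking the key cannot be matched.
    iiif_by_key = {rec[key]: rec for rec in iiif_records if rec.get(key)}
    merged = []
    for record in harvested:
        match = iiif_by_key.get(record.get(key), {})
        # Harvested values win on conflicts; IIIF only contributes new fields.
        merged.append({**match, **record})
    return merged


# Made-up example records standing in for the OAI-PMH and IIIF v3 sources.
oai = [
    {"id": "a", "thumbnail": "https://example.org/t/1.jpg"},
    {"id": "b", "thumbnail": "https://example.org/t/2.jpg"},
]
iiif = [
    {
        "thumbnail": "https://example.org/t/1.jpg",
        "manifest": "https://example.org/iiif/1/manifest",
    },
]
merged = merge_on_thumbnail(oai, iiif)
```

Here the first record picks up the `manifest` field, while the second is passed through unchanged because nothing on the IIIF side shares its thumbnail.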