Open jacobthill opened 1 month ago
@jacobthill I'm a bit confused on this one. How will we know which objects to harvest if they aren't in a collection manifest? The work for AUB is based on the collection manifest offering the list of objects to query, so it doesn't sound like that will be a good solution here. Thanks.
I will provide a list of items in the catalog. Here is an example with IIIF: https://github.com/sul-dlss/dlme-airflow/pull/571
Archive.org now supports IIIF, but there are few, if any, collections we would want to grab in whole. There are many objects that we would want to harvest, but doing so would require manually reviewing lists of items and compiling a list of item-level IIIF manifests. This might be solved by the work @aaron-collier is currently doing to harvest batched IIIF collections for AUB.
This will be critical for building some important browse categories where we need to select specific items that are only available in archive.org.
This would also solve https://github.com/sul-dlss/dlme-airflow/issues/540, https://github.com/sul-dlss/dlme-airflow/issues/541, and would enable us to add several Stanford collections that don't have collection level IIIF manifests.
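To make the idea of harvesting from a curated list concrete, here is a rough sketch, not the dlme-airflow implementation. It assumes the list arrives as a CSV with a hypothetical `manifest_url` column, and every function name below is made up for illustration. The fetcher is passed in so the loop can be exercised without network access.

```python
import csv
import io
import json
import urllib.request


def read_manifest_urls(csv_text, column="manifest_url"):
    """Return the non-empty manifest URLs from a CSV string.

    The column name is an assumption for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    ]


def manifest_label(manifest):
    """Return a plain-text label from a IIIF Presentation 2 or 3 manifest dict."""
    label = manifest.get("label", "")
    if isinstance(label, dict):
        # Presentation 3: language map, e.g. {"en": ["Title"]}
        for values in label.values():
            if values:
                return values[0]
        return ""
    if isinstance(label, list):
        # Presentation 2: may be a list of strings or {"@value": ...} objects
        first = label[0] if label else ""
        return first.get("@value", "") if isinstance(first, dict) else first
    return label


def fetch_json(url):
    """Fetch and parse a manifest over HTTP (default fetcher)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def harvest(urls, fetch=fetch_json):
    """Fetch each item-level manifest and return one small record per item."""
    records = []
    for url in urls:
        manifest = fetch(url)
        records.append(
            {
                "id": manifest.get("id") or manifest.get("@id") or url,
                "label": manifest_label(manifest),
                "manifest": url,
            }
        )
    return records
```

With something like this, the reviewed list of item-level manifests takes the place of a collection manifest as the source of objects to query.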