Open mjordan opened 8 years ago
Looking at this a bit more, we might be able to implement "writer manipulators" following the pattern we have already established. In this use case, a writer manipulator would modify the output path that a set of ingest files is written to, basing it on the set the record is a member of. We'd need some way to relate each object to a set and its corresponding directory. A helper script run prior to the MIK job could issue a ListSets request to the OAI provider, then loop through each set and write out the identifier of each object it contains, associating it with the current set in a set registry (basically a text file). The writer manipulator could then refer to this registry and modify the current object's output path using its entry.
The writer manipulator's entry in the .ini file would look like

writermanipulators[] = "OaiSetMembership|/tmp/set_registry.txt"

where the parameter is the path to the set registry generated by the helper script.
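To make the helper script idea concrete, here is a rough Python sketch of how the set registry could be generated. Nothing here is existing MIK code. The function names, the `fetch` callable, the `oai_dc` metadata prefix and the tab-separated registry format are all assumptions made for illustration. It also ignores resumption tokens, so a real script would need to page through large sets.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_set_specs(list_sets_xml):
    """Return the setSpec of every set in a ListSets response."""
    root = ET.fromstring(list_sets_xml)
    return [el.text.strip()
            for el in root.findall(".//oai:set/oai:setSpec", OAI_NS)
            if el.text]


def parse_identifiers(list_identifiers_xml):
    """Return every record identifier in a ListIdentifiers response."""
    root = ET.fromstring(list_identifiers_xml)
    return [el.text.strip()
            for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
            if el.text]


def build_set_registry(fetch, registry_path, metadata_prefix="oai_dc"):
    """Write one 'identifier<TAB>setSpec' line per object/set pair.

    `fetch(verb, **params)` is a hypothetical callable that performs the
    OAI-PMH request and returns the response body as text. An object that
    belongs to several sets gets one line per set. Returns the number of
    lines written.
    """
    written = 0
    with open(registry_path, "w", encoding="utf-8") as registry:
        for set_spec in parse_set_specs(fetch("ListSets")):
            response = fetch("ListIdentifiers",
                             metadataPrefix=metadata_prefix,
                             set=set_spec)
            for identifier in parse_identifiers(response):
                registry.write(f"{identifier}\t{set_spec}\n")
                written += 1
    return written
```

In practice `fetch` could be a thin wrapper that builds the request URL from the provider's base URL and the verb and parameters. The writer manipulator would then read the registry file named in its .ini entry and look up the current object's identifier.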
Then again, a much simpler approach would be to not introduce writer manipulators at all, and instead have the helper script offer an option that organizes the harvested content into set-based subdirectories after the fact. This is probably the preferable approach until there are additional use cases for writer manipulators.
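A rough sketch of what that after-the-fact option might look like, in Python. This is only an illustration under several assumptions: the set registry is a text file with one tab-separated `identifier`/`setSpec` pair per line, harvested files sit directly in the output directory, and each file's name is derived from the OAI identifier by the hypothetical `stem_for` function. MIK's real file naming may well differ. Colon-separated (hierarchical) setSpecs are mapped to nested directories, which is a design choice, not something the protocol requires.

```python
import shutil
from glob import escape
from pathlib import Path


def load_registry(registry_path):
    """Read 'identifier<TAB>setSpec' lines into a dict.

    If an identifier is listed under several sets, the first one wins
    (an arbitrary choice made for this sketch).
    """
    registry = {}
    with open(registry_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            identifier, set_spec = line.split("\t", 1)
            registry.setdefault(identifier, set_spec)
    return registry


def default_stem(identifier):
    """Hypothetical identifier-to-filename rule: replace ':' and '/'."""
    return identifier.replace(":", "_").replace("/", "_")


def organize_by_set(output_dir, registry, stem_for=default_stem):
    """Move each object's files into a subdirectory named for its set."""
    output_dir = Path(output_dir)
    moved = []
    for identifier, set_spec in registry.items():
        # 'a:b' becomes output_dir/a/b.
        target_dir = output_dir.joinpath(*set_spec.split(":"))
        pattern = escape(stem_for(identifier)) + ".*"
        # Materialize the matches before moving anything.
        for path in sorted(output_dir.glob(pattern)):
            if path.is_file():
                target_dir.mkdir(parents=True, exist_ok=True)
                destination = target_dir / path.name
                shutil.move(str(path), str(destination))
                moved.append(destination)
    return moved
```

Calling `organize_by_set("/path/to/output", load_registry("/tmp/set_registry.txt"))` once the MIK job has finished would leave one subdirectory per set under the output directory. Files whose identifiers are not in the registry stay where they are.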
I was wrong about determining a record's set membership. A record's setSpec values are included in its header in both GetRecord and ListRecords responses.
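For illustration, here is a small Python sketch that pulls set membership straight out of a GetRecord or ListRecords response body. The function name is made up for this example. The only things it relies on are the OAI-PMH 2.0 namespace and the `header`, `identifier` and `setSpec` elements defined by the protocol.

```python
import xml.etree.ElementTree as ET

OAI_URI = "http://www.openarchives.org/OAI/2.0/"
OAI_NS = {"oai": OAI_URI}


def set_membership(response_xml):
    """Map each record identifier in the response to its list of setSpecs.

    Works for both GetRecord (one header) and ListRecords (many headers).
    Records that belong to no set map to an empty list.
    """
    root = ET.fromstring(response_xml)
    membership = {}
    for header in root.iter(f"{{{OAI_URI}}}header"):
        identifier = header.findtext("oai:identifier", default="",
                                     namespaces=OAI_NS).strip()
        specs = [el.text.strip()
                 for el in header.findall("oai:setSpec", OAI_NS)
                 if el.text]
        membership[identifier] = specs
    return membership
```

With this information available at harvest time, a separate ListSets/ListIdentifiers pass may not be needed just to learn which sets a record belongs to.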
I'm working through a use case with someone performing a migration from an OAI repository; please stay tuned.
Potentially related issue: #338.
Many OAI-PMH repositories, such as Digital Commons (and Islandora itself), express collection membership using OAI sets. The OAI-PMH protocol, however, doesn't provide a way to determine an object's set membership; in other words, once you have exported or migrated an object out of the repository, you can't determine which OAI set (or collection) it was a member of.
The OAI toolchain should provide an option to respect the set structure of the source repository. This would allow for migrations that retain collection membership.
Perhaps we can implement this using a fetcher manipulator?