Problem: pipelines can't split transfers automatically

sevein commented 5 years ago

Back in 2014 we received a pull request from @pbrantner suggesting a simple method to split transfers automatically by taking advantage of watched directories, but we've never been able to analyze it in depth.

https://github.com/artefactual/archivematica/pull/99

The pull request has been inactive for more than three years. I'm filing this issue to capture the feature request and create a space where we can discuss other solutions, e.g. could this be achieved via automation tools, and would that be preferable?

For Artefactual use: Please make sure these steps are taken before moving this issue from Review to Verified in Waffle:

All PRs related to this issue are properly linked 👍
All PRs related to this issue have been merged 👍
Test plan for this issue has been implemented and passed 👍
Documentation regarding this issue has been written and it has been added to the release notes, if needed 👍

sromkey commented 5 years ago

I'm not sure what is meant by "split transfers" in this context or how they would be split, but it seems worth discussing/considering.

sevein commented 5 years ago

From what I understood, the workflow suggested by https://github.com/artefactual/archivematica/pull/99 is the following:

A user with physical access to the Archivematica Shared Directory (usually /var/archivematica/sharedDirectory) moves a new transfer into watchedDirectories/activeTransfers/splittedTransfers.

Archivematica has a new watcher pointing to splittedTransfers. When the user adds a new directory, the client script provided in the pull request will start a new transfer for each directory found in the root of the directory provided by the user. For example, the user adds FOOBAR with three directories inside: 1, 2 and 3.

/var/archivematica/sharedDirectory/watchedDirectories/activeTransfers/splittedTransfers/FOOBAR
├── 1
│   └── [contents]
├── 2
│   └── [contents]
└── 3
    └── [contents]

The script creates three transfers: SIP-1, SIP-2, SIP-3.
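
To make the splitting step concrete, here is a minimal sketch of how one transfer per top-level directory could be planned. This is not the code from the pull request: the function name `plan_split_transfers`, the `prefix` parameter and the `SIP-<directory name>` naming are assumptions chosen to match the example above, and the sketch only computes the plan without starting any transfer.

```python
from pathlib import Path
from typing import List, Tuple


def plan_split_transfers(parent: Path, prefix: str = "SIP-") -> List[Tuple[str, Path]]:
    """Return one (transfer_name, source_dir) pair per top-level subdirectory.

    Hypothetical helper: the naming scheme is an assumption, and loose files
    in the root of `parent` are ignored rather than turned into transfers.
    """
    # Sort by path so the resulting order is deterministic.
    children = sorted(p for p in parent.iterdir() if p.is_dir())
    return [(f"{prefix}{child.name}", child) for child in children]


if __name__ == "__main__":
    import tempfile

    # Recreate the FOOBAR example in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "FOOBAR"
        for name in ("1", "2", "3"):
            (root / name).mkdir(parents=True)
        for transfer_name, source in plan_split_transfers(root):
            print(transfer_name, "<-", source)
```

Each pair returned would then be handed to whatever mechanism actually starts a transfer, whether that is a watched directory or something else.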

My guess is that the mechanisms to address this use case would be very different today. For example, I don't think we'd want users to work with watched directories; we would probably look at exposing the functionality via the API instead, integrated with Storage Service (SS) transfer source locations.

Problem: pipelines can't split transfers automatically #224