samvera-labs / hybox-ideas

Community repository for documenting ideas and use cases for the Hydra-in-a-Box project.

batch ingest! #12

Open tomcramer opened 8 years ago

tomcramer commented 8 years ago

From Chealsye at CWRU: I recently joined Case Western and am new to Hydra. We're very interested in a batch ingest function for our repository. I've seen previous posts from LSE working on a batch ingest, Duke discussing batch ingest needs, and a post from WGBH stating that DCE is working on a batch ingest function for them in 2013. What progress has been made on a batch ingest function?

jcoyne commented 8 years ago

"Batch Ingest" is a vague term. Can you describe in more detail what you'd expect? Are we talking about a spreadsheet with one object per row and one metadata field per column? Does it get uploaded via the web? Where do the objects (images, videos, etc.) come from? How do they get matched with the metadata? How do we deal with errors such as being unable to find the matching file, or validation issues with the metadata (duplicate id, missing title)?
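To make those questions concrete, here is a minimal Python sketch of one possible answer, not an existing Hydra feature. It assumes a CSV manifest with one object per row and hypothetical column names `id`, `title`, and `filename`, and it assumes the files sit in a single local directory. It reports the error cases named above (missing file, duplicate id, missing title) instead of stopping at the first one.

```python
import csv
import io
from pathlib import Path


def validate_manifest(csv_text, files_dir):
    """Return a list of (row_number, message) problems found in the manifest.

    The column names "id", "title" and "filename" are assumptions made for
    this sketch; a real importer would agree on them with its users.
    """
    problems = []
    seen_ids = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        object_id = (row.get("id") or "").strip()
        title = (row.get("title") or "").strip()
        filename = (row.get("filename") or "").strip()

        if not object_id:
            problems.append((row_number, "missing id"))
        elif object_id in seen_ids:
            problems.append((row_number, f"duplicate id: {object_id}"))
        else:
            seen_ids.add(object_id)

        if not title:
            problems.append((row_number, "missing title"))

        # Match metadata to a file by name inside files_dir.
        if not filename:
            problems.append((row_number, "missing filename"))
        elif not (Path(files_dir) / filename).is_file():
            problems.append((row_number, f"file not found: {filename}"))
    return problems
```

Collecting every problem before ingesting anything lets the depositor fix the whole spreadsheet in one pass; whether a batch with errors should be rejected outright or partially ingested is still a policy decision this sketch does not make.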

jcoyne commented 8 years ago

Let me know if you want to talk more about this. I've written a batch importer for at least half a dozen institutions and I have yet to find a lot of commonality in the implementation for any of them. Everyone has their own specific requirements. I think the only way to implement this successfully is to begin by selecting a common metadata format that everyone is happy to use.

jimtuttle commented 7 years ago

I'd very much like to see this be something simple enough that we can point researchers at the documentation and they can build SIPs themselves. We've been doing this here: https://docs.google.com/document/d/1n0nSE3pejYaUF70UVCl4Oc6nZ18buK9QCKX2d_8BE44/edit?usp=sharing We're adding administrative metadata (think access controls, roles, etc.) and simple ordering. Ideally, a user could share a Box/Dropbox/etc. folder with the repository, go to a web interface, select their SIP/bag, and press "GO!"
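As a rough illustration of what the repository might check before "GO!" proceeds, the Python sketch below assumes the bag follows the BagIt layout: a `bagit.txt` declaration plus a `manifest-sha256.txt` whose lines pair a checksum with a payload path. That layout is an assumption, since the linked document's actual SIP format is not described in this thread, and the function name `verify_bag` is hypothetical.

```python
import hashlib
from pathlib import Path


def verify_bag(bag_dir, algorithm="sha256"):
    """Return a list of error strings; an empty list means the bag checks out.

    Assumes a BagIt-style bag. This is a sketch: it reads each payload file
    fully into memory and does not guard against paths escaping the bag.
    """
    bag = Path(bag_dir)
    errors = []
    if not (bag / "bagit.txt").is_file():
        errors.append("missing bagit.txt")

    manifest = bag / f"manifest-{algorithm}.txt"
    if not manifest.is_file():
        errors.append(f"missing {manifest.name}")
        return errors

    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <relative path>".
        expected, relative_path = line.split(maxsplit=1)
        payload = bag / relative_path
        if not payload.is_file():
            errors.append(f"missing payload file: {relative_path}")
            continue
        actual = hashlib.new(algorithm, payload.read_bytes()).hexdigest()
        if actual != expected.lower():
            errors.append(f"checksum mismatch: {relative_path}")
    return errors
```

A check like this could run when the user selects their bag in the web interface, so that incomplete uploads from a shared folder are caught before any objects are created.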