LibraryOfCongress / bagit-python

Work with BagIt packages from Python.
http://libraryofcongress.github.io/bagit-python

split bag by size #58

Open johnscancella opened 8 years ago

johnscancella commented 8 years ago

Add the ability to split a bag by size, as the Java version of the library supports, as requested in https://github.com/LibraryOfCongress/bagit-java/issues/47

ntallman commented 7 years ago

Bump to this issue. It is becoming critical for my institution to be able to split bags. Since the CLI was removed from bagit-java, the only way to split bags is to write Java code. My institution is not a Java shop. (I tried to use the CLI in bagit-java 4.12 but got an error I do not know how to resolve, as I know nothing about writing or editing Java code.)

ntallman commented 7 years ago

Bump. There is no way to split bags without building a Java app. This is a serious gap in bagit-python.

acdha commented 6 years ago

@ntallman How do you use this — when bagging, updating existing bags, etc.? Since this is something of an edge case I'm wondering whether it should be a separate utility.

ntallman commented 6 years ago

@acdha It's less critical to me now, but there's still a need. APTrust initially had a 250 GB bag size limit, but it's been increased to 5 TB. However, MetaArchive has a 30 GB bag size limit. Members used to use a script that leveraged the bagit-java CLI to split bags, but since the CLI was removed, there's been no way to do this without writing Java code. I know a lot of places use bagit-python, so it would be great if its features had parity with what's possible using bagit-java.
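
To make the request concrete, here is a rough sketch of how payload files might be grouped so that each group stays under a size limit such as the 30 GB one mentioned above. This is not part of bagit-python or a port of the bagit-java splitter; `plan_split` and `payload_sizes` are hypothetical names, and the grouping uses a simple first-fit-decreasing heuristic that counts payload bytes only (manifests and tag files would add a little on top).

```python
import os
from typing import Iterable, Iterator, List, Tuple


def payload_sizes(data_dir: str) -> Iterator[Tuple[str, int]]:
    """Yield (relative_path, size_in_bytes) for every file under data_dir."""
    for root, _dirs, names in os.walk(data_dir):
        for name in names:
            full = os.path.join(root, name)
            yield os.path.relpath(full, data_dir), os.path.getsize(full)


def plan_split(files: Iterable[Tuple[str, int]], max_bytes: int) -> List[List[str]]:
    """Group files so that each group's total size is at most max_bytes.

    Hypothetical helper: first-fit decreasing, so the result is a reasonable
    packing but not guaranteed to use the fewest possible groups.
    """
    groups: List[List[str]] = []
    totals: List[int] = []
    # Largest files first; ties broken by path so the plan is deterministic.
    for path, size in sorted(files, key=lambda item: (-item[1], item[0])):
        if size > max_bytes:
            raise ValueError(f"{path} ({size} bytes) exceeds the limit on its own")
        for i, total in enumerate(totals):
            if total + size <= max_bytes:
                groups[i].append(path)
                totals[i] += size
                break
        else:
            # No existing group has room, so start a new one.
            groups.append([path])
            totals.append(size)
    return groups


if __name__ == "__main__":
    demo = [("a.tif", 6), ("b.tif", 5), ("c.tif", 4), ("d.tif", 3)]
    print(plan_split(demo, max_bytes=10))
    # [['a.tif', 'c.tif'], ['b.tif', 'd.tif']]
```

Each resulting group could then be copied into its own directory and bagged with `bagit.make_bag`, which would fit the "separate utility" idea raised earlier in the thread.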