LibraryOfCongress / bagit-java

Java library to support the BagIt specification.

BagIt File Packaging Format v1.0 in bagit-java #116

Closed rvanheest closed 6 years ago

rvanheest commented 6 years ago

A couple of weeks ago, the BagIt spec was updated from v14 to v15 and shortly after that to v16. The major difference is that the BagIt File Packaging Format went from v0.97 to v1.0, and with that a couple of parts of the spec changed.

First question is whether there are any plans for these changes to be reflected in the bagit-java library. While looking through the code on the current master branch, I wasn't able to spot any changes yet (except for one instance). Also BagCreator tells me that v0.97 is still the latest version.

Between v14 and v15, I found a change in the way fetch files are described. Since v1.0, the spec says: "Every file listed in the fetch file MUST be listed in every payload manifest." Although I think this is a good change (it was not explicitly described before), it is not reflected at all in the bagit-java library. The first question is whether this should be supported. Probably not in the creation or modification of the Bag object, but perhaps it should be validated in the BagVerifier or BagLinter.
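To make the quoted rule concrete, here is a minimal sketch of what such a check could look like. It works on plain Java collections rather than bagit-java types: the class `FetchManifestCheck`, the method `missingFromManifests`, and the map layout (manifest file name to the set of paths it lists) are hypothetical names chosen for illustration, not part of the library's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the bagit-java API.
final class FetchManifestCheck {
    private FetchManifestCheck() {}

    /**
     * For each payload manifest, collects the fetch paths that the manifest
     * does not list. An empty result means every file in the fetch file is
     * listed in every payload manifest.
     *
     * @param fetchPaths    paths listed in fetch.txt
     * @param manifestPaths manifest file name mapped to the paths it lists
     */
    static Map<String, Set<String>> missingFromManifests(
            Set<String> fetchPaths, Map<String, Set<String>> manifestPaths) {
        Map<String, Set<String>> missing = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> manifest : manifestPaths.entrySet()) {
            // Start from all fetch paths and drop the ones this manifest lists.
            Set<String> notListed = new TreeSet<>(fetchPaths);
            notListed.removeAll(manifest.getValue());
            if (!notListed.isEmpty()) {
                missing.put(manifest.getKey(), notListed);
            }
        }
        return missing;
    }
}
```

Because the rule is a MUST, a non-empty result would presumably be treated as a verification failure rather than a lint warning.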

Another difference between v14 and v15 concerns the tag manifest, for which the spec now says: "each tag manifest SHOULD list the same set of files." Are there plans to incorporate these and other changes into the library?
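The tag manifest rule could be checked in a similar way. The sketch below again uses plain collections; `TagManifestLint` and `listSameFiles` are hypothetical names, and each set stands for the paths listed by one tag manifest.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the bagit-java API.
final class TagManifestLint {
    private TagManifestLint() {}

    /**
     * Returns true when all tag manifests list exactly the same set of paths.
     * Zero or one manifest trivially satisfies the rule.
     */
    static boolean listSameFiles(Collection<Set<String>> tagManifestPaths) {
        Iterator<Set<String>> it = tagManifestPaths.iterator();
        if (!it.hasNext()) {
            return true;
        }
        // Compare every remaining manifest against the first one.
        Set<String> first = it.next();
        while (it.hasNext()) {
            if (!first.equals(it.next())) {
                return false;
            }
        }
        return true;
    }
}
```

Since the spec wording is SHOULD rather than MUST, a `false` result seems better suited to a linter warning than to a hard verification error.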

Although I wasn't able to find references to v1.0, I did spot some references to new Version(2, 0), sometimes also referred to as DOT_BAGIT_VERSION or annotated with @Incubating. Is there a spec available for this, as there is for the current versions, or is this just an experiment of yours?

johnscancella commented 6 years ago

Hi @rvanheest

Most of the changes to the spec were driven in part by the discrepancies I found while rewriting bagit-java. As you noted, we do need to update a few places to match the new wording of the 1.0 spec.

DOT_BAGIT_VERSION is our name for the next (breaking changes) version of BagIt. While we have talked about it internally here at the Library of Congress, there is no official spec (I would keep an eye on the IETF website). @Incubating is an annotation I created to mark a feature as experimental; it should not be relied on at this point in time.

You are most welcome to send a pull request with any changes you think need to be made; otherwise I (or a colleague) will update this library with the changes from the 1.0 spec.

johnscancella commented 6 years ago

See PR https://github.com/LibraryOfCongress/bagit-java/pull/118

johnscancella commented 6 years ago

This is now in release 5.2.0