At Cornell we are doing video digitization work that has so far created files up to 650GB, and we anticipate the possibility of larger files.
On current Unix filesystems, multi-TB files are supported, although they are somewhat unwieldy. On AWS S3 there is support for individual objects up to 5TB, though transfer requires the use of multipart upload in chunks <= 5GB (my experience from a few years ago suggests that chunks somewhat smaller than 5GB would likely work better for internet-scale transfers).
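For illustration, here is a minimal sketch of such a multipart upload using boto3's high-level transfer API. The bucket, key, and filename are hypothetical, and the 1GB part size is just an illustrative choice well under S3's 5GB per-part limit:

```python
import boto3
from boto3.s3.transfer import TransferConfig

GB = 1024 ** 3

s3 = boto3.client("s3")

# Switch to multipart upload above 5GB, using 1GB parts
# (S3 allows parts up to 5GB; smaller parts retry more cheaply).
config = TransferConfig(
    multipart_threshold=5 * GB,
    multipart_chunksize=1 * GB,
)

# upload_file handles part splitting, retries, and completion internally.
s3.upload_file(
    "capture-0001.mov",                          # hypothetical local file
    "my-preservation-bucket",                    # hypothetical bucket
    "obj1/v1/content/capture-0001.mov",          # hypothetical key
    Config=config,
)
```

Note that this chunking happens only at the transfer layer; the stored object is still a single 650GB+ file, which is why per-file limits elsewhere remain the real question.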
Is the issue here per-file filesystem limits?
In general I think multipart files might be in scope, but I would like to see an additional use case or details for them.
F2F 2018.09.05: Chunking is always possible, but OCFL will not specify any way of chunking files.
Some institutions may have very large files that are inconvenient or impossible to store as single files within an OCFL digital object. It would always be possible to split such files into multiple parts, with each part treated as a first-class file by OCFL (see the sketch below), but that pushes the modeling/support burden onto the application. However, an OCFL model/convention for multipart files would allow the development of shared tooling to handle large files.
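As a sketch of what that application-level burden looks like, the following splits a large file into numbered parts, assuming a hypothetical `name.part-NNNN` naming convention; OCFL itself specifies no such scheme, and this is exactly the kind of convention a shared model would standardize:

```python
import os

CHUNK_SIZE = 5 * 1024 ** 3  # 5GB parts, an illustrative choice


def split_file(path, chunk_size=CHUNK_SIZE):
    """Split `path` into numbered part files, returning their paths.

    Each part would then be stored (and fixity-checked) as an
    ordinary first-class file within an OCFL object version.
    """
    parts = []
    with open(path, "rb") as src:
        index = 1
        while True:
            data = src.read(chunk_size)
            if not data:
                break
            # Hypothetical convention: original-name.part-0001, .part-0002, ...
            part_path = f"{path}.part-{index:04d}"
            with open(part_path, "wb") as dst:
                dst.write(data)
            parts.append(part_path)
            index += 1
    return parts
```

Reassembly is just concatenation of the parts in order, but without an agreed convention every consuming application has to know (or guess) the splitting scheme, which is the interoperability cost being weighed here.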