Swift uses the concept of “large objects” and implements segmenting to replicate larger files more efficiently through the storage network. Any object larger than 5GB (unless the administrator of the network has set a different limit) must be segmented into smaller pieces. The size of these segments is set by the user or by the application interacting with Swift. The Swift documentation explains how large objects are treated, including how to set segment sizes for commands and how manifests allow the segments of a large object to be rejoined on download.
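To make the segmenting idea concrete, here is a minimal sketch of how an object could be split into consecutive byte ranges. It is not taken from Swift or the Storage Service: `plan_segments`, `needs_segmenting`, and `DEFAULT_OBJECT_LIMIT` are hypothetical names, and the 5 GiB value is an assumption standing in for whatever limit the cluster is configured with. A manifest would then list these segments in order so they can be rejoined on download.

```python
from typing import List, NamedTuple

# Assumption: stands in for the cluster's configured object limit.
DEFAULT_OBJECT_LIMIT = 5 * 1024**3


class Segment(NamedTuple):
    index: int   # position of the segment within the object
    offset: int  # byte offset where the segment starts
    length: int  # number of bytes in the segment


def needs_segmenting(object_size: int, limit: int = DEFAULT_OBJECT_LIMIT) -> bool:
    """Return True when the object is larger than the limit."""
    return object_size > limit


def plan_segments(object_size: int, segment_size: int) -> List[Segment]:
    """Split object_size bytes into consecutive segments of at most segment_size."""
    if object_size < 0:
        raise ValueError("object_size must not be negative")
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    segments = []
    offset = 0
    while offset < object_size:
        length = min(segment_size, object_size - offset)
        segments.append(Segment(len(segments), offset, length))
        offset += length
    return segments
```

Under these assumptions, a 12 GiB object planned with 5 GiB segments yields three segments, the last one holding the remaining 2 GiB.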
Segmenting was not implemented when the Swift access protocol was developed for the Storage Service. As a result, no object larger than 5GB can be retrieved or deposited when Archivematica is used with a Swift-based storage network configured with the default 5GB object limit.
When using Swift as an access protocol, failures occur at the following locations:
Transfer source: a standard transfer directory cannot contain any object above 5GB; a zipped transfer cannot be larger than 5GB in total. In both cases the transfer will fail to initiate.
Backlog: no object can be above 5GB.
AIP store: no object when uncompressed, and no total transfer when compressed, can be above 5GB.
DIP store: no object in the DIP can be above 5GB.
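As a workaround until segmenting is supported, a transfer could be checked against the limit before it is started. The following is a rough sketch only: `oversized_files`, `transfer_is_safe`, and `SIZE_LIMIT` are hypothetical helpers, not part of Archivematica, and the 5 GiB value is an assumption that should be replaced with the limit configured on the Swift network.

```python
import os
from typing import Iterator, Tuple

# Assumption: replace with the object limit configured on the Swift network.
SIZE_LIMIT = 5 * 1024**3


def oversized_files(root: str, limit: int = SIZE_LIMIT) -> Iterator[Tuple[str, int]]:
    """Yield (path, size) for every regular file under root larger than limit."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinks are skipped rather than followed
            size = os.path.getsize(path)
            if size > limit:
                yield path, size


def transfer_is_safe(path: str, limit: int = SIZE_LIMIT) -> bool:
    """Return True if the transfer at path stays within the limit.

    A directory (standard transfer) is checked file by file; a single
    file (for example a zipped transfer) is checked as a whole.
    """
    if os.path.isdir(path):
        return next(oversized_files(path, limit), None) is None
    return os.path.getsize(path) <= limit
```

This mirrors the transfer source rule above: each object is checked for a standard transfer directory, and the total size is checked for a zipped transfer.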
Typical entries in the storage_service.log file are "413 Request Entity Too Large Your request is too large." and, when initiating a transfer, a MemoryError. The memory errors appear to result from the request timing out because the object cannot be delivered to the file system.
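To confirm that a failure is the one described here, the log can be scanned for the two messages quoted above. This is a small sketch using plain substring matching; it makes no assumption about the log's line format, and `count_swift_failures` and `MARKERS` are hypothetical names.

```python
from collections import Counter
from typing import Iterable

# Substrings taken from the log entries quoted above.
MARKERS = {
    "too_large": "413 Request Entity Too Large",
    "memory_error": "MemoryError",
}


def count_swift_failures(lines: Iterable[str]) -> Counter:
    """Count how many lines contain each known failure marker."""
    counts = Counter()
    for line in lines:
        for label, marker in MARKERS.items():
            if marker in line:
                counts[label] += 1
    return counts
```

The function accepts any iterable of strings, so an open file handle for storage_service.log can be passed in directly.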
Steps to reproduce
Start and/or store a standard or uncompressed transfer containing an object above 5GB, or a zipped/compressed transfer larger than 5GB.
Your environment (version of Archivematica, operating system, other relevant details)
1.11.2, CentOS
For Artefactual use:
Before you close this issue, you must check off the following:
[ ] All pull requests related to this issue are properly linked
[ ] All pull requests related to this issue have been merged
[ ] A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
[ ] Documentation regarding this issue has been written and merged (if applicable)
[ ] Details about this issue have been added to the release notes (if applicable)
Expected behaviour
Part of the reason segmenting was not implemented may be that the python-swiftclient API used by the Storage Service script was not built to handle segments. The SwiftService API, by contrast, does offer segmenting options.
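As an illustration of the SwiftService route, the sketch below uploads a file with python-swiftclient's `SwiftService.upload`, passing the `segment_size` and `use_slo` options only when the file is larger than one segment. It is not the Storage Service's code: `build_upload_options`, `upload_file`, and the 1 GiB segment size are hypothetical choices, and the sketch assumes credentials are supplied through the environment variables that SwiftService reads by default.

```python
import os

# Assumption: 1 GiB is an arbitrary illustrative segment size.
DEFAULT_SEGMENT_SIZE = 1024**3


def build_upload_options(file_size: int, segment_size: int = DEFAULT_SEGMENT_SIZE) -> dict:
    """Return SwiftService upload options, segmenting only when needed."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if file_size <= segment_size:
        return {}  # small enough to upload as a single object
    # use_slo requests a static large object manifest for the segments.
    return {"segment_size": segment_size, "use_slo": True}


def upload_file(container: str, path: str, object_name: str,
                segment_size: int = DEFAULT_SEGMENT_SIZE) -> None:
    """Upload path to container, raising the first error reported."""
    # Imported lazily so build_upload_options works without python-swiftclient.
    from swiftclient.service import SwiftService, SwiftUploadObject

    options = build_upload_options(os.path.getsize(path), segment_size)
    with SwiftService() as swift:
        objects = [SwiftUploadObject(path, object_name=object_name)]
        for result in swift.upload(container, objects, options=options):
            if not result["success"]:
                raise result["error"]
```

Whether a static large object manifest is the right choice for the Storage Service is an open design question; the option is shown here only because SwiftService exposes it.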