Several requests have come in to improve the metadata handling, and I think they are closely related and should be addressed together:
[x] Allow a client to specify the Content-Disposition download file name, overriding the default, which is based on the object name.
[x] Add support for other checksums, e.g. SHA-256, which we might map to a custom Content-SHA256 header.
[x] Allow changes to object metadata via subsequent requests. The object-version content remains immutable, but the following metadata could be changed to improve the presentation of existing objects:
Content-Type
Content-Disposition (limited by a hard-coded regexp to only allow basic filename suggestions)
Content-MD5 (set once, then immutable)
Content-SHA256 (set once, then immutable)
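As a rough illustration of the regexp restriction on Content-Disposition, here is a minimal Python sketch. The pattern and the function name `is_allowed_disposition` are hypothetical and not taken from the existing code. The sketch assumes that only the `filename*=UTF-8''...` form is accepted and that path separators are never allowed in the suggested name.

```python
import re

# Hypothetical pattern: accept only "filename*=UTF-8''<name>", where <name>
# consists of unreserved URL characters or percent-escapes.
_DISPOSITION_RE = re.compile(
    r"filename\*=UTF-8''(?:[A-Za-z0-9._~-]|%[0-9A-Fa-f]{2})+"
)


def is_allowed_disposition(value: str) -> bool:
    """Return True if value is a basic filename suggestion."""
    if _DISPOSITION_RE.fullmatch(value) is None:
        return False
    # Reject percent-encoded "/" and "\" so the name cannot carry a path.
    lowered = value.lower()
    return "%2f" not in lowered and "%5c" not in lowered
```

With this pattern, `filename*=UTF-8''report.csv` would be accepted, while `attachment; filename=x` and `filename*=UTF-8''a%2Fb` would be rejected.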
Considerations
We continue to support only metadata that maps to HTTP headers for object content.
For changes to metadata, we need a different URL that maps to just the target metadata as a sub-resource.
We should consider carefully what control over Content-Disposition is reasonable. We might limit it to just an alternate filename that matches certain regular expressions?
We might introduce new headers like Content-SHA256 to complement Content-MD5?
We should only allow a checksum to be added where none is stored yet. Changing an existing checksum makes no sense: checksums are used during upload for integrity checks, so any change would replace a valid checksum with an invalid one.
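The set-once rule for checksums can be sketched as follows. This is only an illustration: `apply_update`, `MetadataConflict`, and the choice to treat re-submission of an identical value as a harmless no-op are assumptions, not decisions made in this proposal.

```python
# Names that may be set when absent but never changed afterwards.
SET_ONCE = {"content-md5", "content-sha256"}


class MetadataConflict(Exception):
    """Raised when a request tries to change a set-once value."""


def apply_update(current: dict, name: str, value: str) -> dict:
    """Return a copy of current with name set to value, enforcing set-once."""
    key = name.lower()
    if key in SET_ONCE and key in current and current[key] != value:
        raise MetadataConflict(f"{key} is already set and cannot be changed")
    updated = dict(current)  # leave the caller's mapping untouched
    updated[key] = value
    return updated
```

Other metadata such as content-type stays freely mutable under this sketch.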
Implementation Sketch
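Add a new sub-resource space metadata alongside the existing acl space, e.g.
/foo:XYZ;metadata/content-md5 : string content
/foo:XYZ;metadata/content-type : string content
/foo:XYZ;metadata : JSON document {"content-md5": ..., "content-type": ..., ...}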
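To make the addressing concrete, here is a small Python sketch that splits a path such as /foo:XYZ;metadata/content-md5 into its parts. It assumes the text before the last ":" is the object name and the text after it is the version. `MetadataTarget` and `parse_metadata_path` are hypothetical names, not part of the existing code.

```python
from typing import NamedTuple, Optional


class MetadataTarget(NamedTuple):
    name: str                # object name, e.g. "/foo"
    version: Optional[str]   # version identifier, e.g. "XYZ"
    field: Optional[str]     # single field, or None for the whole document


def parse_metadata_path(path: str) -> Optional[MetadataTarget]:
    """Parse '<name>:<version>;metadata[/<field>]'; None if not a metadata URL."""
    resource, semicolon, sub = path.partition(";")
    if not semicolon:
        return None
    space, _, field = sub.partition("/")
    if space != "metadata":
        return None
    name, colon, version = resource.rpartition(":")
    if not colon:
        # No version qualifier present; treat the whole resource as the name.
        name, version = resource, ""
    return MetadataTarget(name, version or None, field.lower() or None)
```

A request for the whole document yields `field=None`, and paths in other sub-resource spaces such as acl are left for their own handlers.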
Change the object store schema to add a metadata column that stores the flexible set of metadata and absorbs the previously separate content_type and content_md5 columns.
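For existing rows, the two legacy columns would need to be folded into the new column. A minimal sketch of that conversion follows, assuming the metadata column holds a JSON object keyed by lower-case header name. The function name and the storage encoding are assumptions.

```python
import json
from typing import Optional


def legacy_columns_to_metadata(content_type: Optional[str],
                               content_md5: Optional[str]) -> str:
    """Build the JSON text for the metadata column from the legacy columns."""
    metadata = {}
    # Only carry over values that were actually set on the old row.
    if content_type is not None:
        metadata["content-type"] = content_type
    if content_md5 is not None:
        metadata["content-md5"] = content_md5
    return json.dumps(metadata, sort_keys=True)
```

Omitting absent values, rather than storing nulls, keeps "not yet set" distinguishable by a simple key lookup.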
Change the code to be more generic and consult a configuration map that lists which header names are supported and what kind of special handling each requires, e.g. regexp validation or comparison of supplied checksums to those computed over stored object data.
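One possible shape for such a configuration map is sketched below in Python. Everything here is illustrative: the map name `SUPPORTED_HEADERS`, the helper functions, and the content-type pattern are hypothetical. The sketch also assumes Content-SHA256 would carry a base64-encoded digest in the same way Content-MD5 does.

```python
import base64
import hashlib
import re


def _digest_checker(algorithm):
    """Build a checker comparing a base64 digest against the object data."""
    def check(value, data):
        digest = hashlib.new(algorithm, data).digest()
        return value == base64.b64encode(digest).decode("ascii")
    return check


def _regexp_checker(pattern):
    """Build a checker that validates the header value against a regexp."""
    compiled = re.compile(pattern)
    def check(value, data):
        return compiled.fullmatch(value) is not None
    return check


# Header name -> checker; names absent from the map are unsupported.
SUPPORTED_HEADERS = {
    "content-type": _regexp_checker(r"[\w.+-]+/[\w.+-]+(?:;.*)?"),
    "content-md5": _digest_checker("md5"),
    "content-sha256": _digest_checker("sha256"),
}


def validate_metadata(metadata, data):
    """Return the names that are unsupported or fail their check."""
    rejected = []
    for name, value in metadata.items():
        checker = SUPPORTED_HEADERS.get(name.lower())
        if checker is None or not checker(value, data):
            rejected.append(name)
    return rejected
```

Keeping the handling in a map means a new header needs only a new entry, whether that map ends up hard-coded or loaded from deployment configuration.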
Questions
Should the set of supported metadata headers be hard-coded or managed by hatrac_config.json and customized by each deployment? Currently hard-coded.
Should we allow hash verification to be enabled or disabled? In global config? On a per-request basis? Currently hard-coded with existing MD5 support and no SHA256 verification.
@hongsudt @mikedarcy @robes @ljpearlman