informatics-isi-edu / hatrac

Simple object storage for collaborations
Apache License 2.0

Support management of object metadata #33

Closed karlcz closed 7 years ago

karlcz commented 7 years ago

Summary

Several requests have come in to improve the metadata handling, and I think they are closely related and should be addressed together:

Considerations

  1. We continue to support only metadata that maps to HTTP headers for object content.
  2. For changes to metadata, we need a different URL that maps to just the target metadata as a sub-resource.
  3. We should consider carefully what control over Content-Disposition is reasonable. We might limit it to just an alternate filename that matches certain regular expressions?
  4. We might introduce new headers like Content-SHA256 to complement Content-MD5?
  5. We should only allow a checksum to be added when none is present. It makes no sense to change an existing checksum: we use checksums for integrity checks during upload, so any change would replace a valid checksum with an invalid one.
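
A minimal sketch of considerations 3 and 5, to make the intended restrictions concrete. Everything here is an assumption for illustration, not hatrac code: the function names, the filename pattern, and the `attachment; filename="..."` form of the header value are all hypothetical.

```python
import re

# Hypothetical pattern: a bare filename, no path separators, quotes, or spaces.
FILENAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")

# Checksum fields named in this issue; content-sha256 is only a proposal.
CHECKSUM_FIELDS = {"content-md5", "content-sha256"}


def content_disposition_for(filename):
    """Build a Content-Disposition value for an acceptable alternate filename."""
    if FILENAME_RE.fullmatch(filename) is None:
        raise ValueError("filename not allowed: %r" % (filename,))
    return 'attachment; filename="%s"' % filename


def add_checksum(metadata, field, value):
    """Return a copy of metadata with a checksum added, only if it was missing."""
    if field not in CHECKSUM_FIELDS:
        raise KeyError("not a checksum field: %s" % field)
    if field in metadata:
        # An existing checksum was already used to validate the upload,
        # so replacing it could only turn a valid value into an invalid one.
        raise ValueError("%s is already set and cannot be changed" % field)
    updated = dict(metadata)
    updated[field] = value
    return updated
```

Under these assumptions, a request to set `content-md5` on an object that already has one would be rejected, while the same request on an object without a checksum would succeed.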

Implementation Sketch

  1. Add a new sub-resource space, metadata, alongside the existing acl space, e.g.
    • /foo:XYZ;metadata/content-md5: string content
    • /foo:XYZ;metadata/content-type: string content
    • /foo:XYZ;metadata: JSON document {"content-md5": ..., "content-type": ..., ...}
  2. Change the object store schema to add a metadata column which stores the flexible set of metadata and absorbs the previously separate content_type and content_md5 columns.
  3. Change the code to be more generic and consult a configuration map for which header names are supported and what kind of special handling they require, e.g. regexp validation and comparison of supplied checksums to those computed over stored object data.
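
The configuration map in step 3 could look roughly like the following. This is a sketch under stated assumptions, not the merged implementation: the map layout, the `validate_metadata` helper, the content-type pattern, and the choice of base64-encoded digests for both checksum headers are all hypothetical.

```python
import base64
import hashlib
import re

# Hypothetical configuration map: header name -> special handling rules.
# "pattern" requests regexp validation; "hash" requests comparison against
# a digest computed over the stored object data.
SUPPORTED_METADATA = {
    "content-type": {"pattern": r"[\w.+-]+/[\w.+-]+(;.*)?"},
    "content-md5": {"hash": "md5"},
    "content-sha256": {"hash": "sha256"},
}


def _digest_b64(algorithm, data):
    """Base64-encode the digest of data (the encoding is an assumption)."""
    digest = hashlib.new(algorithm, data).digest()
    return base64.b64encode(digest).decode("ascii")


def validate_metadata(field, value, object_data):
    """Check one metadata value against the rules configured for its header."""
    rules = SUPPORTED_METADATA.get(field.lower())
    if rules is None:
        raise KeyError("unsupported metadata field: %s" % field)
    pattern = rules.get("pattern")
    if pattern is not None and re.fullmatch(pattern, value) is None:
        raise ValueError("invalid value for %s: %r" % (field, value))
    algorithm = rules.get("hash")
    if algorithm is not None and value != _digest_b64(algorithm, object_data):
        raise ValueError("%s does not match stored object data" % field)
    return value
```

With a map like this, supporting a new header is a data change rather than a code change, and the JSON document served at `;metadata` would simply be the dictionary of validated fields.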

Questions

  1. Should the set of supported metadata headers be hard-coded or managed by hatrac_config.json and customized by each deployment? Currently hard-coded.
  2. Should we allow hash verification to be enabled or disabled? In global config? On per-request basis? Currently hard-coded with existing MD5 support and no SHA256 verification.

@hongsudt @mikedarcy @robes @ljpearlman

karlcz commented 7 years ago

This is now in a pull-request in hatrac awaiting review...

karlcz commented 7 years ago

This has been merged to master.