dotmesh-io / dotmesh

dotmesh (dm) is like git for your data volumes (databases, files etc) in Docker and Kubernetes
https://dotmesh.com
Apache License 2.0

key=value metadata for dots #503

Open alaric-dotmesh opened 6 years ago

alaric-dotmesh commented 6 years ago

Being able to store key=value metadata for dots will be useful for managing large numbers of dots; people will be able to search for them, and we will be able to use metadata to enhance the user interface in countless ways.

In particular, dot metadata will be useful for Dotscience - attaching dots to "projects", marking dots as datasets or workspaces, etc.

So, we need a facility whereby we can:

  1. Assign key=value metadata to dots through a SetDotMetadata API call. Duplicate keys should be allowed: it's more general, and it makes it easy to bind a dataset dot to multiple projects by repeating the dotscience-project=... key.
  2. Return current dot metadata in the results of Get, List, etc.
  3. Reflect changes to dot metadata in the commit history of a dot, which suggests that metadata changes should be treated as a form of commit. There's precedent for this in the automatic generation of commits for S3 metadata inclusion, handoff, etc.
  4. Offer an interface (a backwards-compatible extension to List/AllDotsAndBranches?) to return dots matching a metadata constraint. There are Go libraries that implement k8s-style selectors, which we could use as a query language.
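
To make the four requirements above concrete, here is a rough Go sketch. Everything in it is hypothetical: the `Pair` and `Dot` types, the `SetDotMetadata` signature, and the `Matches`/`ListMatching` helpers are illustrative stand-ins, not the actual dotmesh API. Metadata is held as a slice of pairs rather than a map so that duplicate keys are possible, setting metadata appends an entry to an in-memory history to mimic "metadata change as a commit", and matching is equality-only, a much simpler stand-in for a real k8s-style selector language.

```go
package main

import (
	"fmt"
	"sort"
)

// Pair is one key=value metadata entry (hypothetical type).
type Pair struct {
	Key   string
	Value string
}

// Dot is a hypothetical, in-memory stand-in for a dotmesh dot.
type Dot struct {
	Name     string
	Metadata []Pair   // a slice, not a map, so duplicate keys are allowed
	History  []string // commit messages, oldest first
}

// SetDotMetadata replaces the dot's metadata and records the change
// as an entry in its history, treating the change as a form of commit.
func SetDotMetadata(d *Dot, pairs []Pair) {
	d.Metadata = append([]Pair(nil), pairs...) // copy to avoid aliasing the caller's slice
	d.History = append(d.History, fmt.Sprintf("metadata set: %d pair(s)", len(pairs)))
}

// Matches reports whether d carries every key=value pair in selector.
// Equality-only matching is an assumption made to keep the sketch short.
func Matches(d Dot, selector []Pair) bool {
	for _, want := range selector {
		found := false
		for _, have := range d.Metadata {
			if have == want {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

// ListMatching returns the sorted names of all dots matching selector.
func ListMatching(dots []Dot, selector []Pair) []string {
	names := []string{}
	for _, d := range dots {
		if Matches(d, selector) {
			names = append(names, d.Name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	dataset := Dot{Name: "census-data"}
	// The same key appears twice to bind one dataset dot to two projects.
	SetDotMetadata(&dataset, []Pair{
		{"dotscience-project", "alpha"},
		{"dotscience-project", "beta"},
		{"type", "dataset"},
	})
	workspace := Dot{Name: "alpha-workspace"}
	SetDotMetadata(&workspace, []Pair{
		{"dotscience-project", "alpha"},
		{"type", "workspace"},
	})

	dots := []Dot{dataset, workspace}
	fmt.Println(ListMatching(dots, []Pair{{"dotscience-project", "alpha"}}))
	fmt.Println(ListMatching(dots, []Pair{{"dotscience-project", "beta"}}))
	fmt.Println(dataset.History)
}
```

One design point this surfaces: selector libraries modelled on Kubernetes labels generally assume a single value per key, so allowing duplicate keys would probably mean adapting such a library or defining the matching semantics ourselves (e.g. "any value for this key matches").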

lukemarsden commented 6 years ago

We have this on the Dotscience side as DotRef Metadata in Postgres, but there may be good reasons to put the metadata in the dots themselves instead...