denismakogon commented 6 years ago

Goal: Extend Fn resources to support attached key/value metadata set by API users or set and controlled by extensions.

(@zootalures's edits) (@zootalures - renamed "metadata" to "annotations")

App and Route Annotations

App and Route annotations allow users building on the Fn platform to encode consumer-specific data inline with Apps and Routes.

Annotations can be used either to carry information for external use (written and read only by API users) or to communicate between external applications and Fn extensions.

Use cases/Examples:

Externally defined/consumed annotations

Software using Fn as a service attaches non-identifying annotations to Fn resources for subsequent reads (e.g. my reference for this function/app).

Writer : API user, Reader: API user

E.g. platform "platX" creates/modifies functions, has an internal reference it wants to associate with an object for later retrieval (it can't query/search Fn by this attribute)

POST /v1/apps

 {
 ...
  "annotations" : {
    "platx.com/ref" : "adsfasdfads"
  }
 ...
 }

Extensions: Allow passing data from user or upstream input to extensions (configuration of extensions)

Writer : API user, Reader: Fn platform extension (use), API user (informational)

POST /v1/apps
...
   {
     ...
       "annotations" :  {
              "my_cloud_provider.com/network_id" : "network.id"
       }
     ...
}

Extensions: Allow indicating internally derived/set values to user (API extension sets/generates annotations, prevents user from changing it)

Writer : Internal platform extension, Reader: API user.

GET /v1/apps/myapp
...
   {
     ...
       "annotations" :  {
              "my_cloud_provider.com/create_user" : "foo@foo.com"
       }
     ...
}

PATCH /v1/apps/myapp
...
   {
     ...
       "annotations" :  {
              "my_cloud_provider.com/create_user" : "foo@foo.com"
       }
     ...
}

HTTP/1.1  400 Invalid operation

{
   "error": "annotation key cannot be changed"
}

Content Examples

example : user attaches local annotations

PUT /v1/apps/foo

{
  "app": {
   ...
    "annotations": {
         "mylabel": "super-cool-fn",
         "myMetaData": {
           "k1": "foo",
           "number": 5000,
           "array" : [1,2,3]
         }
     }
  ...
  }
}

User sets extension-specific annotations:

PUT /v1/apps/foo
{
   ...
    "annotations": {
        "example.extension.com/v1/myval" : "val"
    }
  ...
}

Key Syntax and Namespaces

A key consists of printable (non-extended) ASCII characters, excluding whitespace characters.

The maximum (byte) size of a key is 128 bytes (excluding quotes).

Keys are stored as free text with the object. Extensions and systems using annotations must use a namespace prefix based on an identified domain, followed by at least one '/' character.

Systems should use independent annotation keys for any value that can be changed independently.

Extensions should not interact with annotation keys that are not prefixed with a domain they own.
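
The key rules above can be sketched as a small validator. This is only an illustration, not Fn's implementation: the function names are hypothetical, rejecting an empty key is an assumption (the text does not say), and the namespace check is a loose heuristic for "domain followed by '/'".

```python
MAX_KEY_BYTES = 128  # limit stated in the proposal


def validate_key(key):
    """Return a list of problems with an annotation key (empty list if valid)."""
    problems = []
    if not key:
        # Assumption: the proposal does not mention empty keys explicitly.
        problems.append("key is empty")
    # Printable, non-extended ASCII without whitespace is the range 0x21..0x7E.
    if any(not (0x21 <= ord(ch) <= 0x7E) for ch in key):
        problems.append("key contains whitespace or non-printable/non-ASCII characters")
    if len(key.encode("utf-8")) > MAX_KEY_BYTES:
        problems.append("key exceeds 128 bytes")
    return problems


def has_namespace_prefix(key):
    """Heuristic: 'domain/rest', where domain contains a dot and rest is non-empty."""
    domain, sep, rest = key.partition("/")
    return bool(sep) and "." in domain and bool(rest)
```

Under this sketch, `platx.com/ref` passes both checks, while `mylabel` is a valid key but has no namespace prefix, which matches the user-local example earlier in the proposal.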

Value syntax

Values may contain any valid JSON value (object/array/string/number) except the empty string "" and null.

The serialised JSON representation (rendered without excess whitespace as a string) of a single value must not exceed 512 bytes.
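
A rough sketch of the value check, assuming the size is measured on the compact JSON rendering encoded as UTF-8 (the proposal does not name an encoding). The helper names are hypothetical.

```python
import json

MAX_VALUE_BYTES = 512  # limit stated in the proposal


def serialized_size(value):
    """Size in bytes of the value rendered as JSON without excess whitespace."""
    # separators=(",", ":") drops the spaces json.dumps inserts by default.
    compact = json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    return len(compact.encode("utf-8"))  # assumption: UTF-8 byte count


def validate_value(value):
    """Return a list of problems with an annotation value (empty list if valid)."""
    if value is None:
        return ["value is null"]
    if value == "":
        return ["value is the empty string"]
    if serialized_size(value) > MAX_VALUE_BYTES:
        return ["serialised value exceeds 512 bytes"]
    return []
```

Note that a string value's surrounding quotes count toward the limit here, so a plain string can hold at most 510 ASCII characters under this reading.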

Modifying and deleting annotation keys

A key can be modified by a PATCH operation containing a partial annotations object indicating the keys to update (or delete)

A key can be deleted by a PATCH operation by setting its value to an empty string.

For each element of data that can be changed independently, you should use a new top-level annotation key.
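
The PATCH semantics described above might look like the following merge, assuming (as clarified later in this thread) that PATCH applies only to top-level keys and replaces each value whole. The function name is hypothetical.

```python
def apply_annotation_patch(current, patch):
    """Return a new annotations dict with a partial PATCH object applied."""
    result = dict(current)  # leave the stored annotations untouched
    for key, value in patch.items():
        if value == "":
            # An empty string means "delete this key"; a missing key is ignored.
            result.pop(key, None)
        else:
            # The whole top-level value is replaced; nested objects are not merged.
            result[key] = value
    return result
```

For example, patching `{"a": "1", "b": {"x": 1}}` with `{"a": "", "b": {"y": 2}}` leaves `{"b": {"y": 2}}`: key `a` is deleted and `b` is replaced rather than merged, which is why independently changing data belongs under separate top-level keys.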

Maximum number of keys

A user may not add keys in a PATCH or PUT operation if the total number of keys after changes exceeds 100 keys.

Fn may return a larger number of keys.
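
One possible reading of the key-count rule, as a sketch: the limit only blocks requests that add keys, so an object that already carries more than 100 keys (for instance ones set by extensions) can still be modified or trimmed. The function name and this interpretation are assumptions.

```python
MAX_USER_KEYS = 100  # limit stated in the proposal


def may_apply(current, updated):
    """Decide whether a PUT/PATCH resulting in `updated` respects the key limit."""
    added = set(updated) - set(current)
    # Updates and deletions are always allowed; additions only within the limit.
    return not added or len(updated) <= MAX_USER_KEYS
```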

Extension interaction with resource modification

An extension may prevent a PUT, PATCH or POST operation on a domain object based on the value of an annotation passed in by a user. In this case the operation should result in an HTTP 400 error with an informational message indicating that an error was present in the annotations and containing the exact key or keys that caused the error.
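
As an illustration of that behaviour, an extension-side check could collect every offending key and report them together. This is a sketch only: `check_annotations`, the `rules` mapping and the message wording are hypothetical, and the `{"error": ...}` body shape simply mirrors the error example earlier in this proposal.

```python
def check_annotations(annotations, rules):
    """Validate annotations against per-key predicates owned by an extension.

    `rules` maps an annotation key to a function returning True for valid values.
    Returns None if the request may proceed, else (status, body) for a 400.
    """
    offending = sorted(
        key
        for key, is_valid in rules.items()
        if key in annotations and not is_valid(annotations[key])
    )
    if not offending:
        return None
    # Name the exact keys that caused the rejection.
    return 400, {"error": "invalid annotations: " + ", ".join(offending)}
```

Keys outside the extension's own rules are ignored, in line with the guidance that extensions should not interact with keys under domains they do not own.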

Datastore impact

Type: text, nullable by default

The maximum stored size for a given object (when stored serialized as a single JSON object) is 2 + (max_internal_keys * (6 + max_key_length + max_value_length)), or about 74k.
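
The formula can be evaluated directly. The proposal does not give a value for max_internal_keys, so the key counts below are assumptions; the constants 2 and 6 are taken as-is from the formula, with the 128-byte key and 512-byte value limits from the sections above.

```python
def max_stored_bytes(max_internal_keys, max_key_length=128, max_value_length=512):
    """Upper bound on the serialized annotations size, per the formula above."""
    return 2 + max_internal_keys * (6 + max_key_length + max_value_length)


# With only the 100 user-settable keys:
user_only = max_stored_bytes(100)   # 2 + 100 * 646 = 64602 bytes

# An internal limit of about 115 keys lands near the quoted "about 74k":
with_extras = max_stored_bytes(115)  # 2 + 115 * 646 = 74292 bytes
```

So the 74k estimate seems to assume some headroom above the 100-key user limit, consistent with the note that Fn may return a larger number of keys.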

zootalures commented 6 years ago

General Questions for me:

Naming: is "metadata" definitely right?
Values: using JSON objects as values - this seems convenient but might be a pain in SDKs

rdallman commented 6 years ago

A key can be deleted by a PATCH operation by setting its value to an empty string.

for any type, correct? number/object/list/string ?

Values: using JSON objects as values - this seems convenient but might be a pain in SDKs

only concern here is if we want querying based on metadata it isn't easy to do this (afaik) on many datastores if it's a json blob in there. I suppose this could be used as a general tagging system as well and users may want tag-based querying. if tags are orthogonal and something else, tell me to go away. other than that, it LGTM

zootalures commented 6 years ago

Will clarify PATCH behaviour - I think empty string for all is what I had in mind (!?!) which does seem a bit odd - any ideas?

Also will make it clear that PATCH only applies to top-level values. (not modifying content of metadata objects)

Re querying : Yeah - good point : was assuming no extra features here (no querying on tags)

For the service use case where metadata might be identifying I think I can work around that elsewhere

Thinking about options for querying in general, it seems like a useful feature, but I think it would have to be opt-in on specific metadata tags based on extension (i.e. for a given install, some tags are queryable and others aren't)

i.e. Tagomatic extension indexes x.tagomatic.com/tags/*

then query model tolerates an option like : metadata:{"x.tagomatic.com/tags": ["foo","bar"]} and tagomatic does something to make that filter work against that index.

That seems profoundly over-abstracted in other ways though.

sachin-pikle commented 6 years ago

Trying to understand the context of this requirement (not the implementation). If this is not relevant to the discussion, please ignore.

What is a real-life example of where object metadata should be used?
Is it for configuration settings? For example if my function needs to talk to a backend DB instance (host name, user, encrypted pwd, etc.), or my function needs to talk to OAuth (domain, secret key, URLs, etc.), or some other backend service settings, etc.?
Is it for passing in environment-specific settings to my function so it can behave/be configured differently in different environments (dev, test, prod)?
Is it for cloud provider-specific information that OSS Fn won't directly have knowledge of (different for Oracle Cloud vs Google Cloud vs AWS)?
Is it for custom tags e.g. app=ecommerce, env=prod, cost-center=BCQ1, etc. ?
What else?

denismakogon commented 6 years ago

On Mon, Mar 12, 2018 at 7:50 AM Sachin Pikle notifications@github.com wrote:

Trying to understand the context of this requirement (not the implementation). If this is not relevant to the discussion, please ignore.

What is a real-life example of where object metadata should be used?

Extensions may use metadata as the simplest way to store extension-specific configuration. Suppose someone wants to implement a Kafka event-based trigger: there needs to be some kind of store where I can put Kafka credentials so that my extension can reach it, and the function’s metadata is the perfect place to put such configuration.

Is it for configuration settings? For example if my function needs to talk to a backend DB instance (host name, user, encrypted pwd, etc.), or my function needs to talk to OAuth (domain, secret key, URLs, etc.), or some other backend service settings, etc.?

A function by itself doesn’t have access to its metadata. If your function needs to talk to the database, use an app/route config.

Is it for passing in environment-specific settings to my function so it can behave/be configured differently in different environments (dev, test, prod)?

function has nothing to do with its metadata

Is it for cloud provider-specific information that OSS Fn won't directly have knowledge of (different for Oracle Cloud vs Google Cloud vs AWS)?

it could be whatever you want. as I mentioned before, metadata could be very useful for extensions; in a multi-cloud deployment, for example, metadata could carry compute scheduling hints.

What else?


zootalures commented 6 years ago

Thanks @denismakogon , will try and clarify in the issue content.

zootalures commented 6 years ago

I would like to consider tagging/tags independently, as they are directly implicated in search whereas metadata does not need to be. Tagging may be a first-class thing on objects.

rdallman commented 6 years ago

one thing that seems confusing:

config and metadata are effectively the same thing, but appear to be intended for slightly different end-users, neither of which imply exactly for whom this may be from their title alone. Aside from the concern of this not being very user friendly (which may be easy to ameliorate), is this correct?

config is for user's function code to get configuration info
metadata is for fn to get configuration about an app/route (say, a namespace id or some instance credentials)

how can we best delineate these two? is there a reason that the latter information should be omitted from the former (I think so? but not sure) or is this simply an issue that we want more robust (json object) than a key-value string pair?

the most recent comment also implies adding tagging separately, which means we have config, metadata and tags. perhaps this is ok, but if I'm a brand new user it's not immediately obvious what the crossover between these things are. maybe we need all 3, maybe just the names are easy to confuse what is what, it would of course be nice to have as few things as possible if at all possible. fwiw, lambda has tags and environment (==config), both exposed/set by user, but they may also internally have metadata (we may never know).

zootalures commented 6 years ago

config and metadata are effectively the same thing, but appear to be intended for slightly different end-users, neither of which imply exactly for whom this may be from their title alone. Aside from the concern of this not being very user friendly (which may be easy to ameliorate), is this correct?

That is correct (although metadata also includes possibly user-immutable config/data generated by an extension that is passed back to the user (reference system information about the object))

how can we best delineate these two? is there a reason that the latter information should be omitted from the former (I think so? but not sure) or is this simply an issue that we want more robust (json object) than a key-value string pair?

I think they are independent at one level - certainly from a multi-environment perspective - config is relevant to all deploys of the same function in every environment (every function needs its config and the config is defined by the function image content/user code ) whereas the sort of vendor metadata we are looking at would be environment specific (defined by the configuration/extensions of the specific install that the user is targeting). Namespacing on metadata is a sleight of hand to ensure that user can define multiple configurations in a single place for different environments without those overlapping or causing problems.

Config forms an intrinsic part of the definition of a function in all cases; metadata forms part of the user's definition of the function in a particular context (perhaps "contextdata", "vendorconfig" or "systemconfig" are better names).

At the end of the day everything is just an attribute on the function object(s) - providing a split at the top-level to distinguish between these two cases is my preference, as ultimately they are consumed independently and should be treated independently by a user.

the most recent comment also implies adding tagging separately, which means we have config, metadata and tags. perhaps this is ok, but if I'm a brand new user it's not immediately obvious what the crossover between these things are. maybe we need all 3, maybe just the names are easy to confuse what is what, it would of course be nice to have as few things as possible if at all possible. fwiw, lambda has tags and environment (==config), both exposed/set by user, but they may also internally have metadata (we may never know).

ack - I'd like to revisit tagging separately

lambda isn't an extensible platform so this makes me more comfortable about having it as a separate concern.

Once this is defined, consumers and generators will still need to be judicious about its use - I've tried to capture the system constraints here (numbers and sizes) but I'll also add a specific example

zootalures commented 6 years ago

I'll get started on a PR and we can come round again on that.

treeder commented 6 years ago

I assume this should be attached to #714 ?

zootalures commented 6 years ago

retconning the original issue to reflect renaming "metadata" to "annotations"

fnproject / fn

Introduce object annotations to app and route #789
