Open jcooklin opened 8 years ago
This looks good, but I think we should break out the encryption piece. In my mind, there are varying degrees of complexity on how this should work. This variance is dependent on how we manage the remote communication between nodes.
For example, I don't believe it is safe to have a symmetric key sitting on the file system. If we are going to go the payload route, I think it will require some two-step handshaking, not unlike how snapd handles encryption between itself and plugins.
However, if these calls are going to be HTTP-based, we could just go the transport route and use the existing HTTPS code in the API.
Tribe should/could also support high availability as suggested in #773.
Very interesting. However, some questions come to mind:
Thanks
Good questions Olivier :)
Another question from me, related to the dynamic addition of nodes to tribes through policies. I am wondering if we could go one step further and dynamically create the tribe if it doesn't exist. E.g. snapd is started on a node, the tribe policy adds it to tribe "foo", tribe "foo" is not known so it is created (and gossip will take care of making the other nodes aware of that tribe). This way, the initial step of pre-creating the tribes would be unnecessary. I think I have a use case where this functionality could be very useful. Would that make sense?
Thanks.
This spec proposes the following features and enhancements to tribe.
Subtribes
To improve the operational experience, what is currently known as an ‘agreement’ will be replaced with the term ‘tribe’, and actions that affect a tribe will be made explicit. Let’s start with an example.
Instead of creating a named agreement, we will create a named tribe and join all the members to it.
Let’s imagine that i1 and i2 are somehow special and should have additional plugins loaded and tasks running beyond what is defined by the core tribe.
On our core tribe we will load an influxdb and psutil plugin and start a task capturing basic OS utilization details. On our storage tribe we will load the smart plugin and start a task capturing disk IO.
Explicitly referring to the tribe when loading plugins or tasks reduces the risk that a user accidentally affects the entire tribe when they perform actions that are intended for an individual node. It also more effectively supports multiple potentially overlapping tribes.
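To make the overlapping-tribe idea concrete, here is a minimal in-memory sketch, not snap's actual implementation. The `TribeRegistry` class, its method names, and the extra nodes `i3` and `i4` are assumptions introduced only for illustration. It shows that every plugin load has to name a tribe, and that a node ends up with the union of the plugins of all tribes it belongs to.

```python
from collections import defaultdict


class TribeRegistry:
    """Toy in-memory model of named, possibly overlapping tribes (hypothetical)."""

    def __init__(self):
        self.members = defaultdict(set)   # tribe name -> node names
        self.plugins = defaultdict(set)   # tribe name -> plugin names

    def join(self, tribe, node):
        self.members[tribe].add(node)

    def load_plugin(self, tribe, plugin):
        # The tribe is a required argument: there is no implicit "all nodes" target.
        if tribe not in self.members:
            raise KeyError(f"unknown tribe: {tribe}")
        self.plugins[tribe].add(plugin)

    def plugins_for(self, node):
        # A node runs the union of the plugins of every tribe it belongs to.
        result = set()
        for tribe, nodes in self.members.items():
            if node in nodes:
                result |= self.plugins[tribe]
        return result


registry = TribeRegistry()
for node in ("i1", "i2", "i3", "i4"):      # i3 and i4 are hypothetical extra members
    registry.join("core", node)
for node in ("i1", "i2"):
    registry.join("storage", node)

registry.load_plugin("core", "influxdb")
registry.load_plugin("core", "psutil")
registry.load_plugin("storage", "smart")
```

With this setup, `registry.plugins_for("i1")` contains all three plugins, while `registry.plugins_for("i3")` contains only the two core plugins.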
Dynamically adding nodes to tribes through policies
When started in tribe mode, snap will establish a list of facts collected from the node it is running on as well as arbitrary key/value pairs provided on startup. These facts will then be used to evaluate tribe policies. When a policy is evaluated positively it will result in the node being added to a tribe.
Example facts: architecture, default_ipv4, default_ipv6, devices, os_dist, os_dist_release, os_dist_version, processor_type, processor_features, memtotal,…
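As a rough illustration of fact collection, the sketch below merges a few facts read from the host with arbitrary key/value pairs supplied at startup. The function name `collect_facts`, the `key=value` string format, and the `system` fact are assumptions; only `architecture` and `processor_type` come from the list above, and how snap would actually gather them is not specified here.

```python
import platform


def collect_facts(extra_pairs):
    """Return host facts merged with user-supplied "key=value" strings."""
    facts = {
        "architecture": platform.machine(),
        "processor_type": platform.processor(),
        "system": platform.system(),   # hypothetical fact name, not in the spec's list
    }
    for pair in extra_pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        # Values are kept as strings; no type coercion is attempted.
        facts[key.strip()] = value.strip()
    return facts


facts = collect_facts(["storage_tier=True"])
```

User-supplied pairs are applied last, so in this sketch they can override a collected fact with the same name. Whether that should be allowed is a design decision the spec leaves open.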
Adding a policy:
When snap is started in tribe mode on an Ubuntu host with the policy above configured, it will automatically join the core tribe. If it has the fact storage_tier=True, it will also be added to the storage tribe.
Other affected components
facts
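The policy evaluation described in this section could look roughly like the sketch below. The policy shape (a tribe name plus a `when` mapping of required fact values) and the function `evaluate_policies` are assumptions made for illustration; the spec's real policy syntax is not reproduced here. A policy matches when every condition equals the corresponding fact.

```python
def evaluate_policies(facts, policies):
    """Return the tribes whose policy conditions all match the given facts."""
    tribes = []
    for policy in policies:
        conditions = policy["when"]
        if all(facts.get(key) == value for key, value in conditions.items()):
            tribes.append(policy["tribe"])
    return tribes


# Hypothetical policies mirroring the Ubuntu / storage_tier example in the text.
policies = [
    {"tribe": "core", "when": {"os_dist": "Ubuntu"}},
    {"tribe": "storage", "when": {"storage_tier": "True"}},
]

plain_node = {"os_dist": "Ubuntu"}
storage_node = {"os_dist": "Ubuntu", "storage_tier": "True"}

print(evaluate_policies(plain_node, policies))    # ['core']
print(evaluate_policies(storage_node, policies))  # ['core', 'storage']
```

Exact equality keeps the sketch simple. A real policy language would probably also need comparisons such as version ranges or minimum memory.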
Process and publish through remote nodes (calling remote plugins)
Tribe makes it possible to reference a named tribe in the process and/or publish portion of a task definition. When a plugin is loaded on a node that is associated with a tribe, it will share its connection details with the global tribe as part of its metadata. This enables each snap node in the tribe to call remote plugins.
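One way to picture the shared connection details is a small directory keyed by tribe and plugin name, sketched below. Everything in it is hypothetical: the `PluginEndpoint` fields, the `TribePluginDirectory` class, the address, and the task fragment are illustrations, not snap's metadata format or task schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PluginEndpoint:
    node: str
    plugin: str
    address: str   # hypothetical "host:port" connection detail


class TribePluginDirectory:
    """Connection details that nodes share when they load a plugin."""

    def __init__(self):
        self._endpoints = {}   # (tribe, plugin) -> list of PluginEndpoint

    def advertise(self, tribe, endpoint):
        self._endpoints.setdefault((tribe, endpoint.plugin), []).append(endpoint)

    def resolve(self, tribe, plugin):
        endpoints = self._endpoints.get((tribe, plugin), [])
        if not endpoints:
            raise LookupError(f"no {plugin!r} plugin advertised in tribe {tribe!r}")
        return endpoints


directory = TribePluginDirectory()
directory.advertise("core", PluginEndpoint("i1", "influxdb", "10.0.0.1:8181"))

# Hypothetical task fragment: the publish step names a tribe instead of a local plugin.
task = {"publish": {"plugin": "influxdb", "tribe": "core"}}
targets = directory.resolve(task["publish"]["tribe"], task["publish"]["plugin"])
```

Here `targets` holds the endpoints a node could call to publish remotely. How a caller picks one of several endpoints (round robin, nearest, first healthy) is left open.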
Other affected components
Encrypt tribe messages
Protect tribe communication by supporting symmetric key encryption (snapd keygen).
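For discussion, here is a sketch of what a key generation step might produce. It covers only generating and validating a key, not the encryption of tribe messages. The 32-byte key size, the base64 encoding, and the function names are assumptions, since the spec does not say what `snapd keygen` outputs.

```python
import base64
import secrets

KEY_BYTES = 32  # assumed key size (256 bits); the spec does not state one


def generate_key():
    """Return a fresh random key, base64-encoded for use in a config or flag."""
    return base64.b64encode(secrets.token_bytes(KEY_BYTES)).decode("ascii")


def load_key(encoded):
    """Decode a key and reject anything that is not exactly KEY_BYTES long."""
    raw = base64.b64decode(encoded, validate=True)
    if len(raw) != KEY_BYTES:
        raise ValueError(f"expected {KEY_BYTES} key bytes, got {len(raw)}")
    return raw


encoded_key = generate_key()
raw_key = load_key(encoded_key)
```

Validating the key length on load catches truncated or mistyped keys early. Where the key is stored and how it is distributed between nodes is the open question raised earlier in this thread.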