project-receptor / python-receptor

Project Receptor is a flexible multi-service relayer with remote execution and orchestration capabilities linking controllers with executors across a mesh of nodes.

RFE: Receptor node “relay security level” feature #110

Open dbaker-rh opened 4 years ago

dbaker-rh commented 4 years ago

This is a general request for enhancement within the receptor mesh to address the fundamental expectation of directionality of traffic in a traditional multi-layer network design.

Apologies in advance for the length of this message.

Situation being considered:

In a traditional on-prem/data center style network there’s frequently the simplification that traffic heading outwards is fine, traffic heading inwards is not. This is well explained at this reference, using Cisco ASA interface security levels as the example: https://geek-university.com/ccna-security/asa-security-levels-explained/

Note: one difference here is that interfaces in a Cisco ASA firewall are by necessity the boundaries BETWEEN security levels and, as that URL notes, traffic does not flow by default between interfaces with the same security level. In our receptor mesh example this does not hold true: nodes are contained INSIDE given security levels, and nodes with the same security level must be able to relay traffic between themselves.

In this example use case, the expectation is that traffic should flow freely outwards through network zones but not inwards. That expectation is thrown out of the window once receptor mesh is installed.

Let’s rework that in the form of a problem to solve:

Problem statement:

The concept behind the proposed enhancement is to grant every node a “relay security level” as an immutable element of its registration into the mesh. For the sake of this example I propose simply an integer in the range 0..255.

Because any node can perform any or all of the three roles, we take a very simple approach to security levels and state this: Any node will only receive a WORK ORDER message from another node if the sender has an equal, or higher, security level than its own.

In our simplest case we’ll see that if all nodes have the same security level, there is no change in behaviour. All nodes are fully trusted.

In a simple demonstration (the network diagram from the Cisco security level URL), we can assign level 100 to all nodes in a core network, and level 50 to all nodes in the DMZ. All nodes within the core network are mutually trusted. All nodes within the DMZ are mutually trusted. Any node in the DMZ can receive a work order from the core. No node in the core will receive a work order from the DMZ. Directionality is created. (See the notes at the end for the nuance that this must apply only to work orders; result messages MUST be permitted to flow in reverse.)
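The rule and the core/DMZ demonstration can be sketched in a few lines of Python. This is only an illustration of the proposal, not receptor code: the `Node` class and the `accepts_work_order` function are hypothetical names invented here.

```python
from dataclasses import dataclass

# Range proposed in this RFE.
MIN_LEVEL, MAX_LEVEL = 0, 255


@dataclass(frozen=True)  # frozen: the level cannot be changed after creation
class Node:
    """Hypothetical node record; `level` is the proposed relay security level."""
    node_id: str
    level: int

    def __post_init__(self):
        if not MIN_LEVEL <= self.level <= MAX_LEVEL:
            raise ValueError(f"level must be in {MIN_LEVEL}..{MAX_LEVEL}")


def accepts_work_order(receiver: Node, sender: Node) -> bool:
    """A node receives a work order only from an equal or higher level."""
    return sender.level >= receiver.level


core_a = Node("core-a", 100)
core_b = Node("core-b", 100)
dmz_a = Node("dmz-a", 50)

print(accepts_work_order(core_b, core_a))  # core -> core, same level
print(accepts_work_order(dmz_a, core_a))   # core -> DMZ, outwards
print(accepts_work_order(core_a, dmz_a))   # DMZ -> core, inwards
```

With these levels the first two checks come out true and the third false, which is the directionality described above.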

Two additional, sample network scenarios are worth mentioning.

First, a network with two concentric circles of DMZ. Apply levels 100, 50, 25 and the directionality is extended in the same sequence, as one would expect. Second, a network with two independent DMZs. Apply levels 100, 50, 50. So long as there isn’t a direct connection between the two DMZs, the only way to route a message between them is through the core. That route is now impossible, since it would require sending a message THROUGH a higher security level even though the final recipient is at the same level.
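Both scenarios reduce to applying the same per-hop rule along a route. A minimal sketch, assuming a route is represented simply as the list of security levels of the nodes it passes through (the function name `work_order_path_allowed` is made up for this example):

```python
def work_order_path_allowed(levels_along_path):
    """True if every hop goes from an equal-or-higher level to the next node."""
    hops = zip(levels_along_path, levels_along_path[1:])
    return all(sender >= receiver for sender, receiver in hops)


# Concentric DMZs: core (100) -> outer DMZ (50) -> outermost DMZ (25).
print(work_order_path_allowed([100, 50, 25]))

# Two independent DMZs: DMZ (50) -> core (100) -> other DMZ (50).
# The first hop already fails, because the core will not receive from level 50.
print(work_order_path_allowed([50, 100, 50]))
```

The first route is allowed at every hop. The second is rejected at its first hop, so the two DMZs stay isolated from each other unless they are directly connected.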

The result is better containment. Even if a DMZ is fully breached, and the receptor node is breached, the compromise is limited to the hosts within the perimeter of that security level. No other zones, not the core, not other elements. Separately, interesting questions come into play, such as whether it is now possible to DoS the core by sending spurious result messages, but spoofed work orders cannot be sent.

Additional thoughts:

One important difference between receptor mesh and firewalls is that in the Cisco ASA firewall example URL, the interfaces are by necessity the boundaries BETWEEN security levels. As that URL notes, traffic does not flow by default between interfaces with the same security level. In our receptor mesh example this does not hold true: nodes are contained INSIDE given security levels, and nodes with the same security level must be able to relay traffic between themselves.

A special case of “security level 0” will add value. This is to say, “this node is not allowed to relay work requests to anyone”, or effectively to flag the node as entirely untrusted. This might apply for a hub and spoke model where only one node is present within an untrusted network. As soon as two nodes are present in that untrusted network, they will need a non-zero (but presumed small) integer so as to allow them to relay work between themselves.
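Level 0 needs an explicit branch, because the plain "equal or higher" comparison would still let two level-0 nodes relay to each other. A sketch of how that might look, with `may_send_work_order` as a hypothetical name:

```python
UNTRUSTED_LEVEL = 0  # the special case proposed above


def may_send_work_order(sender_level: int, receiver_level: int) -> bool:
    """Equal-or-higher rule, except that level 0 may never send work orders."""
    if sender_level == UNTRUSTED_LEVEL:
        return False
    return sender_level >= receiver_level


print(may_send_work_order(0, 0))    # two level-0 nodes cannot relay work
print(may_send_work_order(5, 5))    # a small non-zero level restores relaying
print(may_send_work_order(100, 0))  # the hub can still reach the level-0 spoke
```

Under this sketch a level-0 node can still receive work from any non-zero level; it just cannot originate or relay work orders itself.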

This control over relaying messages must only apply to WORK ORDERS. Any results that are to be returned to the sender must be accepted. It would be prudent to be careful in the processing of result messages as they may have originated in a hostile network zone. At the least, avoidance of any type of buffer overrun or data mishandling; confirmation that the result correlates with a valid and recent work order; etc.
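As a rough sketch of the "valid and recent work order" check, a sender could track the work orders it has issued and refuse any result that does not match one. Everything here is an assumption made for illustration: the `OutstandingWorkOrders` class, the idea of a `work_id` string, the age limit, and the payload size cap are not taken from receptor.

```python
import time


class OutstandingWorkOrders:
    """Hypothetical tracker for work orders this node has sent out."""

    def __init__(self, max_age_seconds=3600.0, max_result_bytes=1_000_000):
        self.max_age_seconds = max_age_seconds    # assumed recency window
        self.max_result_bytes = max_result_bytes  # assumed payload size cap
        self._sent_at = {}                        # work_id -> send time

    def record_sent(self, work_id, now=None):
        self._sent_at[work_id] = time.monotonic() if now is None else now

    def accept_result(self, work_id, payload, now=None):
        """Accept a result only if it matches a known, recent work order."""
        now = time.monotonic() if now is None else now
        sent_at = self._sent_at.get(work_id)
        if sent_at is None:
            return False  # no such work order was ever sent
        if now - sent_at > self.max_age_seconds:
            return False  # the work order is too old
        if len(payload) > self.max_result_bytes:
            return False  # refuse oversized payloads before parsing them
        return True


tracker = OutstandingWorkOrders(max_age_seconds=60.0, max_result_bytes=16)
tracker.record_sent("job-1", now=0.0)
print(tracker.accept_result("job-1", b"done", now=30.0))  # known and recent
print(tracker.accept_result("job-2", b"done", now=30.0))  # never sent
```

A check like this also limits the spurious-result DoS concern somewhat, since unsolicited results can be dropped cheaply before any further processing.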

A different use-case of a hub and spoke network with a single bastion node and multiple independent cloud virtual networks would also benefit from the same design. A single, centralized, and secured bastion node(s) can send messages out along any spoke, but there’s no call for spokes to send messages between themselves.

Where/how to embed this security level is left as an exercise to the reader. One place to consider is a data element in the mTLS certs that the node presents to the network.

In consideration of the DMZ model we should also consider directionality of registration to the mesh. In the worst case, we cannot permit a breached DMZ node to simply re-register itself with the mesh at a higher security level. This, too, is left as an exercise to the reader.
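One possible shape for the immutability requirement, purely as a sketch: whatever component handles registration remembers the level first seen for each node identity and refuses any later registration that presents a different one. The `LevelRegistry` class is hypothetical, and it does not address a breached node claiming a brand-new identity; that would have to be handled by however identities are issued.

```python
class LevelRegistry:
    """Hypothetical registry that pins each node ID to its first-seen level."""

    def __init__(self):
        self._levels = {}  # node_id -> security level

    def register(self, node_id: str, level: int) -> bool:
        if not 0 <= level <= 255:
            raise ValueError("level must be in 0..255")
        known = self._levels.get(node_id)
        if known is None:
            self._levels[node_id] = level  # first registration pins the level
            return True
        # Re-registration succeeds only with the level already on record.
        return level == known


registry = LevelRegistry()
print(registry.register("dmz-a", 50))   # first registration
print(registry.register("dmz-a", 50))   # reconnect at the same level
print(registry.register("dmz-a", 100))  # attempted escalation is refused
```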