bluejekyll / vermilionrc

A safe and reliable process manager

IPC or Event Bus #5

Open bluejekyll opened 4 years ago

bluejekyll commented 4 years ago

See https://github.com/bluejekyll/vermilion/issues/2#issuecomment-554667565 for application architecture:

If Vermilion were a military organization, then the IPC system it uses would be the person the General trusts to take the orders as written and deliver them to the field commanders, without changing or influencing them in any way. In that regard, the IPC should have no role in influencing the outcome of decisions, nor any potential to cause decisions that were not properly originated to be carried out.

It's currently not clear whether this project should build its own Event Bus. We don't necessarily want to take over a job for which the OS might already provide a better option; that is TBD.

Requirements:

Questions:

Notes:

hawkw commented 4 years ago
  • What should the underlying protocol be? Should we consider HTTP2 or QUIC, vs. named pipes, etc.?

If Vermilion expects to run only on POSIX-compliant systems, using Unix domain sockets is probably a good option.

A lot of the features in networking protocols like HTTP2 or QUIC may not be strictly necessary for IPC, if constrained to IPC only within one physical computer (i.e., not across a "real" network); things like preventing out-of-order delivery, TCP congestion control, etc., are just overhead if there isn't a real network link involved.

tarcieri commented 4 years ago

If you follow a proper Unix process hierarchy as your supervision architecture, all communication can occur over pipes or socketpairs (which are, I believe, technically AF_UNIX) with no need to involve the filesystem.
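A minimal sketch of the socketpair idea, using only the Rust standard library. `UnixStream::pair()` wraps `socketpair(2)` with `AF_UNIX`/`SOCK_STREAM`, so neither end is bound to a filesystem path. The function names are hypothetical, and a thread stands in for the child process purely to keep the example short.

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;
use std::thread;

/// Sends `msg` through an anonymous socket pair and returns what the peer read.
/// The peer is a thread here (an assumption for brevity); in a supervisor it
/// would be a child process that inherited its end of the pair.
fn round_trip(msg: &[u8]) -> std::io::Result<Vec<u8>> {
    // Both ends come back already connected; no path is ever bound.
    let (mut supervisor_end, mut worker_end) = UnixStream::pair()?;

    let worker = thread::spawn(move || -> std::io::Result<Vec<u8>> {
        let mut received = Vec::new();
        // Returns once the writing side has been closed.
        worker_end.read_to_end(&mut received)?;
        Ok(received)
    });

    supervisor_end.write_all(msg)?;
    drop(supervisor_end); // closing our end signals EOF to the reader

    worker.join().expect("worker thread panicked")
}

/// Reports whether a freshly created pair has no filesystem name.
fn pair_is_unnamed() -> std::io::Result<bool> {
    let (a, _b) = UnixStream::pair()?;
    Ok(a.local_addr()?.is_unnamed())
}
```

Calling `round_trip(b"start web")` should hand back the same bytes, and `pair_is_unnamed()` is expected to report that the socket has no path associated with it.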

bluejekyll commented 4 years ago

Ok. I’ll need to investigate a concrete plan for this.

tarcieri commented 4 years ago

@bluejekyll every parent process has a secure channel to a child process via stdin/stdout/stderr; however, these are generally used for things like logging (à la something like systemd-journald).

But the parent/supervisor can also create a separate pipe() pair or socketpair() to share with the child prior to forking.
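A rough sketch of handing one end of a socketpair to a child, using only the Rust standard library. Two simplifications are assumptions of this example and not part of the proposal: `cat` stands in for a supervised service, and the pair is attached to the child's stdin/stdout because `std::process::Command` can only assign descriptors 0 to 2 without `unsafe` code or an extra crate. A real supervisor would likely pass the channel on a separate descriptor and keep stdio for logging. The function name is hypothetical.

```rust
use std::io::{Read, Write};
use std::net::Shutdown;
use std::os::fd::OwnedFd;
use std::os::unix::net::UnixStream;
use std::process::{Command, Stdio};

/// Spawns `cat` with one end of a socketpair as its stdin and stdout,
/// sends `msg` over the parent's end, and returns the echoed bytes.
fn echo_through_child(msg: &[u8]) -> std::io::Result<Vec<u8>> {
    // Created before the child exists, so only parent and child ever hold it.
    let (mut parent_end, child_end) = UnixStream::pair()?;
    let child_in = OwnedFd::from(child_end.try_clone()?);
    let child_out = OwnedFd::from(child_end);

    let mut child = Command::new("cat")
        .stdin(Stdio::from(child_in))
        .stdout(Stdio::from(child_out))
        .stderr(Stdio::null())
        .spawn()?;

    parent_end.write_all(msg)?;
    let mut reply = vec![0u8; msg.len()];
    parent_end.read_exact(&mut reply)?;

    // Half-close: the child sees EOF on its stdin and exits.
    parent_end.shutdown(Shutdown::Write)?;
    child.wait()?;
    Ok(reply)
}
```

For instance, `echo_through_child(b"ping\n")` should return the same bytes, without any socket file ever appearing on disk.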

bluejekyll commented 4 years ago

I think I understand where you're going. It differs a little from how I was thinking of this. Is your idea that we could construct that graph from the first fork all the way down? I hadn’t been thinking of it in that way...

tarcieri commented 4 years ago

yes, and then the highest authority process can manage the capability lists (a.k.a. CLists) for each subordinate, and mediate the authority they have between each other (effectively being the ultimate arbiter of authority for the system), kind of like the seL4 kernel. In seL4, it's pretty much the kernel's one job.

bluejekyll commented 4 years ago

Not having worked with seL4, is it accurate that its message size is 115 bits?

Would we want to restrict ourselves to this limit to potentially leverage seL4's internal IPC?

tarcieri commented 4 years ago

@bluejekyll I was using seL4 as an example / food for thought. It is, of course, a kernel, not a supervisor (among many other differences), but there are some ideas about how it organizes authority I really like.

I think the key takeaways for building an init system with similar concepts:

A high-authority process with minimal code size, acting as the broker between all subprocesses/subcomponents and as the enforcer for OCap security.

For example, this process can maintain a list of all of the OCaps granted to each subprocess (known as a CList in OCap parlance). This means that no subprocess can use or forge a capability that isn't in its CList. To me this is preferable to trying to use cryptography to ensure the authenticity of OCaps: in addition to avoiding all the worries about the ways cryptography can go wrong, it also means that OCap delegation happens explicitly as part of a secure protocol, and therefore OCaps can't be leaked through side channels.

It also means OCaps can be stateful, which generally makes them more flexible than OCaps represented using cryptography alone.
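To make the CList idea concrete, here is a small sketch of a broker that tracks capabilities per subprocess. Everything in it (`Broker`, `Capability`, the variant names, and using a `u32` as a process identifier) is hypothetical and only illustrates the bookkeeping described above, not an actual Vermilion design.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical identifier the broker assigns to each subprocess.
type ProcessId = u32;

/// Hypothetical capabilities; a real system would define its own set.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Capability {
    StartService(String),
    ReadLog(String),
}

/// The high-authority process: the only place where CLists live.
#[derive(Default)]
struct Broker {
    clists: HashMap<ProcessId, HashSet<Capability>>,
}

impl Broker {
    /// Initial grant, made by the broker itself.
    fn grant(&mut self, to: ProcessId, cap: Capability) {
        self.clists.entry(to).or_default().insert(cap);
    }

    /// A request is honored only if the capability is in the caller's CList.
    fn holds(&self, who: ProcessId, cap: &Capability) -> bool {
        self.clists.get(&who).map_or(false, |clist| clist.contains(cap))
    }

    /// Delegation is explicit and mediated: it fails unless `from` holds `cap`.
    fn delegate(&mut self, from: ProcessId, to: ProcessId, cap: &Capability) -> bool {
        if !self.holds(from, cap) {
            return false;
        }
        self.grant(to, cap.clone());
        true
    }

    /// Because the CList is state held by the broker, revocation is just a removal.
    fn revoke(&mut self, from: ProcessId, cap: &Capability) -> bool {
        self.clists.get_mut(&from).map_or(false, |clist| clist.remove(cap))
    }
}
```

Since a subprocess never holds the capability itself, only an entry in the broker's table, there is nothing for it to forge, and revocation needs no cryptographic machinery.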

CameronNemo commented 4 years ago

Should we look at using protobuf for all messaging?

I would recommend a zero copy wire format, like capnproto, flatbuffers, or SBE.

What should the underlying protocol be? Should we consider HTTP2 or QUIC, vs. named pipes, etc.?

I would suggest using an AF_UNIX socket if the communication will be local-only. The question is whether you want to use SOCK_STREAM, SOCK_DGRAM, or SOCK_SEQPACKET. With AF_UNIX, datagrams are actually reliable. I think the main difference between DGRAM and SEQPACKET is that the former is connectionless while the latter forms a sequenced two way connection. So if replies are expected, SEQPACKET might make more sense. But if an endpoint's purpose is to just beam lots of events asynchronously, then DGRAM would be fine.
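As a small illustration of the datagram case, the sketch below uses `UnixDatagram::pair()` from the Rust standard library, which creates a connected `AF_UNIX`/`SOCK_DGRAM` pair. It shows the property that matters for an event stream: each `send` arrives as one whole message, in order, with no framing needed. The event names and the 64-byte buffer are arbitrary assumptions; as far as I know the stable standard library offers no `SOCK_SEQPACKET` wrapper, so that variant is not shown.

```rust
use std::os::unix::net::UnixDatagram;

/// Sends two events over an AF_UNIX datagram pair and returns them as received.
fn send_two_events() -> std::io::Result<Vec<Vec<u8>>> {
    let (tx, rx) = UnixDatagram::pair()?;

    // Each send is one datagram; boundaries are preserved by the kernel.
    tx.send(b"job-started")?;
    tx.send(b"job-exited")?;

    let mut events = Vec::new();
    let mut buf = [0u8; 64]; // arbitrary size, large enough for these messages
    for _ in 0..2 {
        // One recv returns exactly one datagram, never a merge of two.
        let n = rx.recv(&mut buf)?;
        events.push(buf[..n].to_vec());
    }
    Ok(events)
}
```

With `SOCK_STREAM` the same two writes could be returned by a single read as one run of bytes, which is why a stream-based design needs its own message framing.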

I currently maintain a fork of Upstart, so if you want my thoughts on the trade-offs of different event bus architectures or the FSM for the job/process lifecycle, feel free to prod me.

bluejekyll commented 4 years ago

Thanks for the feedback. I really appreciate it and would love your perspective on how we should move this forward. I haven’t had enough time to work on this recently, but I do have a branch where I’ve been experimenting with different options.

I’ve been playing with SOCK_STREAM and passing connections directly through process ownership. This is under-documented and under-tested, but shows what I’ve been building towards: https://github.com/bluejekyll/vermilionrc/tree/experiment-with-forking?files=1

tarcieri commented 4 years ago

I would recommend a zero copy wire format, like capnproto, flatbuffers, or SBE.

I'm not sure the tradeoffs of these formats actually make sense in this context.

They all provide micro-optimizations for decoding, but at the cost of added complexity, and sometimes security. The flatbuffers crate presently has known soundness issues, for example.

I'd suggest a simple encoding which doesn't have known unsafety/soundness issues, and in general that security should take precedence over micro-optimizations.
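One possible reading of "a simple encoding" is a length-prefixed frame written entirely in safe Rust. The sketch below is an assumption for illustration, not a format anyone in this thread proposed: a 4-byte big-endian length followed by the payload, with a hypothetical `MAX_FRAME_LEN` cap so a hostile peer cannot request an oversized allocation.

```rust
/// Hypothetical upper bound on payload size.
const MAX_FRAME_LEN: usize = 64 * 1024;

/// Prefixes `payload` with its length as a big-endian u32.
/// Returns None if the payload exceeds the cap.
fn encode_frame(payload: &[u8]) -> Option<Vec<u8>> {
    if payload.len() > MAX_FRAME_LEN {
        return None;
    }
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    frame.extend_from_slice(payload);
    Some(frame)
}

/// Splits one frame off the front of `buf`, returning (payload, remaining bytes).
/// Returns None if the header is incomplete, the declared length exceeds the
/// cap, or the payload has not fully arrived.
fn decode_frame(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    if buf.len() < 4 {
        return None;
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if len > MAX_FRAME_LEN {
        return None;
    }
    let rest = &buf[4..];
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}
```

Every length is checked before a slice is taken, so malformed input yields `None` instead of a panic or an out-of-bounds read; the cost is one copy per message on the encoding side.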