OpenFn / kit

The bits & pieces that make OpenFn work. (diagrammer, cli, compiler, runtime, runtime manager, logger, etc.)

engine: can we emit events straight to the Phoenix socket? #544

Open josephjclark opened 11 months ago

josephjclark commented 11 months ago

Something is bothering me.

When the runtime emits an object - a log or a run result - that object gets serialized many times:

As a result of this there's also quite an ugly chain of event mappings between the runtime and the worker. And I think if we have a worker_thread -> child_process -> engine main -> worker architecture, as I'm planning to introduce, the amount of serialisation goes up.

But if the engine connected to the socket directly, we could do less serialisation and less conversion.

Now, there are problems with this. It's a major blurring of the engine and the worker - they both do the same thing, and in effect the engine is coupled to lightning.

The worker is supposed to just be a lightning interface layer, and the engine is supposed to be a generic, long running, multi-threaded (whatever that means) wrapper around the actual runtime.

That's a nice architecture really, with a strong separation of concerns. But it may be stretched a little too thin, and the cost of serialisation may be too high.

Maybe a better approach is:

The worker needs to track the life cycle of the attempt, but it doesn't really need to know all the state objects and stuff. Even better if no state gets loaded into any shared memory at all. So you have a lightweight eventing layer which doesn't send any state objects (or log messages) - basically a blind layer which sees events but not their payload (also ideal for tracing and external debugging!) - and a deeper layer which sends full payloads out to lightning.
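To make the two-layer idea concrete, here is a minimal sketch. Every name in it (`EngineEvent`, `Notice`, `createEmitter`, the `notify` and `send` callbacks) is hypothetical and not taken from the kit codebase; it only assumes that payloads are already strings by the time they are emitted.

```typescript
// Hypothetical shapes, for illustration only.
type EngineEvent = { type: string; attemptId: string; payload: string };
type Notice = { type: string; attemptId: string; size: number };

// Blind layer: records that an event happened, never what it contained.
function toNotice(event: EngineEvent): Notice {
  return {
    type: event.type,
    attemptId: event.attemptId,
    size: event.payload.length,
  };
}

// Builds an emit function that feeds both layers:
// - send: the deep layer, which ships the full payload out (e.g. to lightning)
// - notify: the lightweight layer, which only sees the payload-free notice
function createEmitter(
  notify: (notice: Notice) => void,
  send: (event: EngineEvent) => void
) {
  return (event: EngineEvent): void => {
    send(event);
    notify(toNotice(event));
  };
}

// Stand-ins for the worker (notices) and the socket (sent).
const notices: Notice[] = [];
const sent: EngineEvent[] = [];
const emit = createEmitter(
  (notice) => notices.push(notice),
  (event) => sent.push(event)
);

emit({
  type: 'log',
  attemptId: 'a1',
  payload: JSON.stringify({ message: 'hello' }),
});
```

In this sketch the worker would only ever receive `Notice` objects, which is enough to follow the attempt life cycle without any state reaching it.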

It's a big change but food for thought.

josephjclark commented 9 months ago

Stu very wisely suggested that the innermost worker should send the payload as a string, so that it isn't constantly serialised and deserialised.
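A rough way to see what that saves: the sketch below counts JSON work when each hop parses and re-stringifies the message, versus forwarding an opaque string. The hop count of 3 and the `run` helper are assumptions made for illustration; this is a model of the idea, not a measurement of the real engine.

```typescript
type Counter = { stringify: number; parse: number };

// Simulates a message crossing `hopCount` boundaries.
// 'naive' hops rebuild the object and serialise it again;
// 'passThrough' hops forward the string untouched.
function run(hopKind: 'naive' | 'passThrough', hopCount: number) {
  const counter: Counter = { stringify: 0, parse: 0 };

  const encode = (value: unknown): string => {
    counter.stringify++;
    return JSON.stringify(value);
  };
  const decode = (wire: string): unknown => {
    counter.parse++;
    return JSON.parse(wire);
  };

  // The innermost worker serialises exactly once.
  let wire = encode({ level: 'info', message: 'hello' });

  for (let i = 0; i < hopCount; i++) {
    if (hopKind === 'naive') {
      wire = encode(decode(wire));
    }
    // passThrough: nothing to do, the string is forwarded as-is
  }

  return { wire, counter };
}

const naive = run('naive', 3);
const direct = run('passThrough', 3);

console.log(naive.counter, direct.counter);
```

Both runs end with the same string on the wire; only the amount of parsing and stringifying along the way differs. A string still has to be copied across each boundary, so this reduces the work rather than eliminating it.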

josephjclark commented 9 months ago

In order for this to work, the inner thread will create its own socket to lightning. That itself is fine, but you'd need to send the worker token down into the thread as well as the attempt token in order to connect to the socket. Now there are two JWTs in the sandbox environment.

We should be able to secure those JWTs so that a breakout can't get them. But it is nice to be able to say "the worker environment is totally clean and there's nothing sensitive in it".

josephjclark commented 1 month ago

This is back in contention again because we THINK that the main worker thread is being bombarded with socket messages and creating a bottleneck. As capacity increases, more and more messages are being processed by the main worker thread, slowing everything down.

The idea of having the child process speak directly to lightning feels like it would bring huge performance gains.

But the architecture worries me. The engine is supposed to be generic but now we're asking it to know about lightning. We're blurring the line between the worker and the engine.

Can the worker inject plugins somehow? Can the worker push code to hook to events inside the process? That's all we're talking about - we don't want to change the engine's behaviour, we just want to hook to events inside the engine.
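One possible shape for that, as a sketch only: the engine exposes a generic hook registry and knows nothing about lightning, while the worker supplies a plugin that does. `Hooks`, `Plugin`, `createEngine`, `lightningPlugin` and the event names are all hypothetical, and the `outbox` array stands in for a real socket.

```typescript
type EngineEventName = 'log' | 'step-complete';
type Listener = (payload: string) => void;
type Hooks = { on: (name: EngineEventName, fn: Listener) => void };
type Plugin = (hooks: Hooks) => void;

// Generic engine: it only knows how to register listeners and fire events.
function createEngine(plugins: Plugin[]) {
  const listeners = new Map<EngineEventName, Listener[]>();

  const hooks: Hooks = {
    on(name, fn) {
      listeners.set(name, [...(listeners.get(name) ?? []), fn]);
    },
  };

  // Each plugin gets a chance to attach its listeners up front.
  for (const plugin of plugins) plugin(hooks);

  return {
    emit(name: EngineEventName, payload: string): void {
      for (const fn of listeners.get(name) ?? []) fn(payload);
    },
  };
}

// Worker-side code: the only place that knows about lightning.
const outbox: string[] = [];
const lightningPlugin: Plugin = ({ on }) => {
  on('log', (payload) => outbox.push(`log:${payload}`));
};

const engine = createEngine([lightningPlugin]);
engine.emit('log', '{"message":"hello"}');
engine.emit('step-complete', '{}'); // no listener registered, so nothing is sent
```

The engine's behaviour is unchanged whether or not a plugin is present, which is the property being asked for. Across a real process or thread boundary the plugin would presumably have to be handed over as a module path rather than a function, since functions cannot be passed between processes directly.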