bigtestjs / server

All BigTest development has moved to https://github.com/thefrontside/bigtest

Create an abstraction for processes which may take some time to get started #25

Closed jnicklas closed 4 years ago

jnicklas commented 4 years ago

I actually went for a more "evented" implementation at first, but backtracked from that for a few reasons.

One thing to note regarding state is that while we do get some state management for free with Effection, there is currently no distinction between a process that is starting and one that is running, which is what we really need. I'm not sure how to express that purely with forks either, since, as you pointed out, splitting the main forked process into two isn't really an option because it would prevent proper resource cleanup.

My first implementation actually was evented, in the sense that Process inherited from EventEmitter (it still does, though it's currently unused). When the process was ready, we'd emit a "ready" event. We'd then attach an event handler like this:

let server = new SomeServer();
fork(function*() {
  yield on(server, "ready");
  console.log("ready!");
});

This works fine in the current implementation, because fork is somewhat synchronous, but we already know that we may need to change this. If we do, then we have a pretty evil bug in this system, where it could potentially deadlock if the ready event is emitted before we attach the listener.
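The lost-event hazard described above can be reproduced with nothing but Node's EventEmitter. This is a minimal sketch, not the BigTest code: SomeServer and its start() method are hypothetical stand-ins, and the event is emitted synchronously to make the ordering problem obvious.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the server from the thread: it announces
// readiness exactly once, and here it does so synchronously.
class SomeServer extends EventEmitter {
  start() {
    this.emit('ready');
  }
}

const server = new SomeServer();
let sawReady = false;

// The server becomes ready before anyone is listening...
server.start();

// ...so a listener attached afterwards is never called.
server.once('ready', () => {
  sawReady = true;
});

console.log(sawReady); // false
```

Anything that waits on that listener would wait forever, which is the deadlock the comment warns about.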

To work around this, I first created a promise in the constructor which attached to the ready event, before changing it to the current solution. I'm still torn on whether that solution is better than the current code or not.
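A rough sketch of the promise-in-the-constructor idea follows. It is an assumption about the shape of that intermediate solution, not the code from the PR: the `ready` property and `start()` method are hypothetical names.

```javascript
const { EventEmitter } = require('events');

// Hypothetical server that subscribes to its own "ready" event at
// construction time, before anything has a chance to emit it.
class SomeServer extends EventEmitter {
  constructor() {
    super();
    this.ready = new Promise((resolve) => this.once('ready', resolve));
  }

  start() {
    this.emit('ready');
  }
}

const server = new SomeServer();
server.start(); // "ready" fires before any outside code is waiting

let sawReady = false;

// Attaching late is harmless: the promise remembers that it resolved.
server.ready.then(() => {
  sawReady = true;
  console.log('ready!');
});
```

The promise acts as a one-bit memory of the event, so the order of "emit" and "wait" no longer matters.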

I think an interesting longer-term solution could be to actually add this sort of mechanism to Effection itself, since it's presumably pretty common to have processes which have some kind of asynchronous start up (and maybe clean up, as well?)

One benefit of the design that I've chosen here, which I didn't mention and which may not be super apparent at the moment, is that it's working towards future changes which we'll likely need, mainly being able to treat the various servers as objects and event sources. This will allow coordination between the various parts of the system. My thinking here is that the orchestrator will pass itself to all the children it starts, and then we could do something like this:

// in command server
this.orchestrator.runTest(new TestRun("./test/something.test.js"));

// in orchestrator
runTest(testRun: TestRun) {
  this.testRuns.push(testRun);
  this.emit("test:start", testRun);
}

// in connection server
*run() {
  forkOnEvent(this.orchestrator, "test:start", function*(testRun) {
    send({ type: "test:start", testRun });
  });
}

This is all obviously total pseudo-code, but you get the idea.
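For readers who want something they can run, here is one way the pseudo-code above could be wired together with plain EventEmitter listeners. It is only a sketch under assumptions: it uses `on` instead of the hypothetical forkOnEvent, and the ConnectionServer constructor and `send` callback are invented for illustration.

```javascript
const { EventEmitter } = require('events');

class TestRun {
  constructor(path) {
    this.path = path;
  }
}

// The orchestrator records test runs and announces each one.
class Orchestrator extends EventEmitter {
  constructor() {
    super();
    this.testRuns = [];
  }

  runTest(testRun) {
    this.testRuns.push(testRun);
    this.emit('test:start', testRun);
  }
}

// Stand-in for the connection server: it forwards orchestrator events
// as messages through whatever `send` function it is given.
class ConnectionServer {
  constructor(orchestrator, send) {
    orchestrator.on('test:start', (testRun) => {
      send({ type: 'test:start', testRun });
    });
  }
}

const sent = [];
const orchestrator = new Orchestrator();
new ConnectionServer(orchestrator, (message) => sent.push(message));

// What the command server would do:
orchestrator.runTest(new TestRun('./test/something.test.js'));

console.log(sent.length, sent[0].type); // 1 test:start
```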

cowboyd commented 4 years ago

I share your concern about race conditions, especially as it pertains to event emissions. It's actually one of the reasons I'm not so keen on event APIs in the first place: If you happen to "miss" the event because you weren't listening at the precise moment that it was emitted, then that's it, you're out of luck. It's gone and you never have a chance to catch that train again. In fact, it's precisely this uncertainty that has prevented me from ever (yet) choosing NodeJS over any other backend technology :) So far, we've had (I think) a lot of success by eliminating the evented nature of the code, and it's my sincere hope that we can keep it that way.

The difference in what I'm proposing is that while there technically is a change event, what I left unstated is that it would be the only true event related to state ever. And that one true event would contain the complete state, in its entirety, every time. All other events, and all other code executed would fundamentally be nothing more than derivations from that single event.

Where I'm imagining us getting to eventually is more akin to pattern matching like in Elixir and Haskell than something truly evented. E.g.

when(state, current => current.isActivating, function*() {
  console.log('Effect is currently activating');
});
when(state, current => current.isActive, function*() {
  console.log('Effect is active!');
});

and the moment that something no longer matches, then the enclosed process is halted immediately:

when(state, current => current.isActivating, function* showSpinner() {
   while (true) {
     yield rewriteConsoleLine('activating');
     yield timeout(400);
     yield rewriteConsoleLine('activating.');
     yield timeout(400);
     yield rewriteConsoleLine('activating..');
     yield timeout(400);
     yield rewriteConsoleLine('activating...');
     yield timeout(400);
   }
});

This is much more declarative in nature. Essentially we've erected a "guard" around a process that is completely devoid of any direct event handling or operation handling.
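To make the "guard" idea concrete, here is a small sketch that does not depend on Effection. Everything in it is an assumption for illustration: the State class is hypothetical, and instead of a generator the guarded process is a function that returns its own stop function.

```javascript
// Hypothetical state holder: stores one value and notifies on every change.
class State {
  constructor(initial) {
    this.current = initial;
    this.listeners = new Set();
  }

  set(next) {
    this.current = next;
    for (const listener of [...this.listeners]) listener(next);
  }

  subscribe(listener) {
    this.listeners.add(listener);
    listener(this.current); // evaluate against the current state right away
    return () => this.listeners.delete(listener);
  }
}

// Run `start` while `predicate` matches. `start` must return a stop
// function, which is called as soon as the state stops matching.
function when(state, predicate, start) {
  let stop = null;
  return state.subscribe((current) => {
    const matches = Boolean(predicate(current));
    if (matches && stop === null) {
      stop = start(current);
    } else if (!matches && stop !== null) {
      stop();
      stop = null;
    }
  });
}

const log = [];
const state = new State({ isActivating: false, isActive: false });

when(state, (current) => current.isActivating, () => {
  log.push('spinner on');
  return () => log.push('spinner off');
});

state.set({ isActivating: true, isActive: false });
state.set({ isActivating: false, isActive: true });

console.log(log); // [ 'spinner on', 'spinner off' ]
```

The guarded function never touches an event handler; it is started and halted purely as a consequence of what the state looks like.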

That's not to say that you can't also have events, but they would be emitted strictly by matching against a concrete state. Hand wavy syntax:

state.deriveEvent('active', current => current.isActive);
state.deriveEvent('childrenAdded', (current, previous) => {
  return current.children.length > previous.children.length;
});
state.deriveEvent('childrenRemoved', (current, previous) => {
  return current.children.length < previous.children.length;
});

I think we'd want to use these events to build declarative APIs like the one I'm suggesting above.
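One possible reading of the hand-wavy deriveEvent syntax, sketched on top of Node's EventEmitter. The State class and its `set` method are assumptions; the only "real" event is the state change, and named events are emitted when a predicate over (current, previous) holds.

```javascript
const { EventEmitter } = require('events');

// Hypothetical state object whose named events are all derived from
// comparing the new state with the previous one.
class State extends EventEmitter {
  constructor(initial) {
    super();
    this.current = initial;
    this.derivations = [];
  }

  deriveEvent(name, predicate) {
    this.derivations.push({ name, predicate });
  }

  set(next) {
    const previous = this.current;
    this.current = next;
    for (const { name, predicate } of this.derivations) {
      if (predicate(next, previous)) this.emit(name, next, previous);
    }
  }
}

const state = new State({ children: [] });

state.deriveEvent('childrenAdded', (current, previous) => {
  return current.children.length > previous.children.length;
});
state.deriveEvent('childrenRemoved', (current, previous) => {
  return current.children.length < previous.children.length;
});

const seen = [];
state.on('childrenAdded', () => seen.push('added'));
state.on('childrenRemoved', () => seen.push('removed'));

state.set({ children: ['a'] });
state.set({ children: ['a', 'b'] });
state.set({ children: ['b'] });

console.log(seen); // [ 'added', 'added', 'removed' ]
```

In this sketch a predicate that looks only at the current state, such as `current.isActive`, would fire on every update while it holds; whether it should fire only on the transition is a design choice the thread leaves open.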

This approach has some key advantages:

Never miss an event

Because events are completely synthetic and derived from the current state, it doesn't matter when you start listening. It could be at the 1st state, or the 10th, or the millionth. The code you need to run in reaction to it will still run.

This is really nice when you want to keep persistent state in, say, a database, then boot up your system and have it "converge" to a runtime profile.
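A tiny sketch of the "converge" behaviour, assuming a hypothetical createState helper and a JSON string standing in for a row loaded from a database. The subscriber attaches long after the state was produced and still gets to react to it.

```javascript
// Hypothetical: state persisted earlier, e.g. loaded from a database.
const persisted = '{"status":"active","version":10}';

function createState(initial) {
  let current = initial;
  const listeners = new Set();
  return {
    set(next) {
      current = next;
      listeners.forEach((listener) => listener(current));
    },
    // A new subscriber is handed the current state right away.
    subscribe(listener) {
      listeners.add(listener);
      listener(current);
    },
  };
}

const state = createState(JSON.parse(persisted));
const reactions = [];

// Subscribing at the 10th state still triggers the reaction.
state.subscribe((current) => {
  if (current.status === 'active') reactions.push(current.version);
});

state.set({ status: 'active', version: 11 });

console.log(reactions); // [ 10, 11 ]
```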

Events are not constrained by the source

This is another beef that I have with classic event systems... they're closed to extension.

If I rely on the event source to define what it means to be "ready", then if I need to react to something subtly different I either have to add a bunch of caveats to the place where I'm reacting, or modify the source code where the event I want to detect should be raised.

On the other hand, if the source merely radiates its state whenever it changes, then the consumers can decide for themselves what events that state implies. There's no limit on the number of events or the ways that you can react.

Serializable

If we've got a bunch of processes distributed over the network, then it makes it easy to distribute events as well, just by exchanging states wholesale across the wire. I can't invoke a callback handler that exists in an entirely different process, but if I'm just matching on a sub-state that was received from another process, I actually can.
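The serialization point can be illustrated in a few lines. The state shape and the port number below are made up for the example; the "wire" is just a JSON string passed between two halves of the same script.

```javascript
// "Process A": the state is plain data, so it can be sent as JSON.
const wire = JSON.stringify({
  server: { isActive: true, port: 24000 }, // hypothetical shape and port
  children: [],
});

// "Process B": rebuild the state and apply a matcher defined locally.
const received = JSON.parse(wire);
const isServerActive = (state) => state.server.isActive;

// A callback, by contrast, does not survive the trip: JSON.stringify
// drops function-valued properties.
const roundTripped = JSON.parse(JSON.stringify({ handler: () => {} }));

console.log(isServerActive(received)); // true
console.log('handler' in roundTripped); // false
```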

Anyhow, we don't have to come to a decision here, and I think we definitely need to be able to have flexibility about what exactly a state is (not just something that is computed off the Fork object) but I just wanted to clarify where I'm coming from because I do really feel like the "true ultimate power" ™️ will come when we can get to matching on states, rather than "true" events, as the primary reactive primitive.

If I had to sum it up, I'd very much like to explore the idea of a "structured" approach to state, events, and reactivity where every event can be traced back to some fundamental state change. I don't know if this PR is the place to do it though 😆

I've gone on long enough so I'll just end, but very much looking forward to discussing this further!

cowboyd commented 4 years ago

> I'm not sure how to express that purely with forks either, since as you pointed out, splitting the main forked process into two isn't really an option, since that will prevent proper resource cleanup.

I think the key is to be able to cleanly map internal fork state to some external state. The fork state of actually binding the listener is the state that contains what we're after, so what would be nice is a way of saying "hey, I want to map this particular fork state to some named property of an external state object".

It might help to try and manually surface fork state before attempting to do it declaratively and even automatically.
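As a sketch of what "manually surfacing" might look like, the example below uses a plain generator stepped by hand rather than Effection itself. The `external` object, its `status` property, and the status names are all assumptions made for illustration.

```javascript
// Hypothetical external state object that the process updates by hand.
const external = { status: 'pending' };

function* serverProcess(state) {
  state.status = 'starting';
  try {
    yield 'bind listener'; // stands in for the asynchronous bind step
    state.status = 'running';
    yield 'serve'; // stands in for serving until halted
  } finally {
    state.status = 'stopped'; // cleanup stays inside the same process
  }
}

const statuses = [];
const proc = serverProcess(external);

proc.next(); // runs up to the first yield
statuses.push(external.status);

proc.next(); // the "bind" step has completed
statuses.push(external.status);

proc.return(); // halting the process triggers the finally block
statuses.push(external.status);

console.log(statuses); // [ 'starting', 'running', 'stopped' ]
```

The starting/running distinction lives in one process, so cleanup is not split across two forks, and outside observers only ever look at `external.status`.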

> I think an interesting longer-term solution could be to actually add this sort of mechanism to Effection itself, since it's presumably pretty common to have processes which have some kind of asynchronous start up (and maybe clean up, as well?)

> One benefit of the design that I've chosen here, which I didn't mention and which may not be super apparent at the moment, is that it's working towards future changes which we'll likely need. Mainly being able to treat the various servers as objects and event sources. This will allow coordination between the various parts of the system. My thinking here is that the orchestrator will pass itself to all the children it starts, and then we could do something like this:

I agree. This is going to be common enough that how you handle these things needs to be part of the complete Effection narrative.

Still not sure what form it should take 🤔

cowboyd commented 4 years ago

I just found this which also seems to have some overlap with Effection https://github.com/phenax/algebraic-effects

It seems like there are some generalizations here around side effects beyond just evaluation. I'm particularly interested in the State effect....

jnicklas commented 4 years ago

Closing this PR, since it was merged as part of #27