brettwooldridge / NuProcess

Low-overhead, non-blocking I/O, external Process implementation for Java
Apache License 2.0
712 stars 84 forks source link

Adds ability to run processes synchronously (#95). #104

Closed tnymlr closed 4 years ago

tnymlr commented 5 years ago

As per discussed in #95, this one implements ability to run processes synchronously with pumping happening on the client thread.

To achieve that, it extends builder, factory and process classes with run method that starts the process and then creates new event processor per client to listen for events. Processor then executes event loop the same way as it does for asynchronous code path.

The change is tested with new runProcessSynchronously test in CatTest.

Additionally it bumps minor version, because we've added a new feature here.

bturner commented 5 years ago

I've reviewed this change internally, prior to @dikeert opening this pull request, just to try and ensure we put up something as straightforward as possible. We've tried pretty hard to keep the change as small as possible, avoiding incidental changes where possible. While this change still ends up being fairly large, it's adding a fairly big feature.

As @dikeert indicated, we're happy to make any changes you'd like to see in order to move this forward. As discussed on #95, getting some form of pump-on-thread synchronous execution added to NuProcess is critical to our ability to use it for Bitbucket Server. We recognize that your time is both valuable and limited, so if there's anything we can do here to help, please let us know!

(Also, not sure what's up with the Shippable build; that doesn't seem like it's related to anything we've changed here. That looks more like an infrastructure failure to me.)

bturner commented 5 years ago

Apologies @brettwooldridge, this isn't quite ready for review. CatTest.LottaProcessListener isn't quite rigorous enough in its checks to detect the case where onStdinReady and onStdout are both never called, so runProcessSynchronously passes even though, on Windows, the stdin/stdout/stderr are never actually pumped. That's a trivial fix, but once that's fixed there's some secondary issue with processes hanging. (It all really does work on macOS and Linux, I should add.)

If you haven't already reviewed, please hold off while we do some further testing, and add some additional unit tests.

bturner commented 5 years ago

Just noting that I've found the issue(s) with the Windows changes and have some local fixups that (legitimately) pass all the tests. We'll circle back with an update to this pull request soon.

bturner commented 4 years ago

@brettwooldridge I think this pull request should now be ready for review. I've fixed all the issues I could find (though I'm keen to get your feedback on whether you'd like any of them solved differently), and significantly expanded the test coverage for synchronous processes.

I've also updated Bitbucket Server to build against a snapshot of these changes and run some of its own tests. I've verified that those tests all pass on Linux, macOS and Windows. (At this point we're working internally on switching more of our code over to NuProcess, which will steadily increase the amount of second-hand NuProcess coverage our tests provide.)

brettwooldridge commented 4 years ago

I need to review this fully, but one possible recommendation for handling a process that never exits, and therefore hangs the runSynchronously() call, is to allow a maxRuntime parameter to be provided. What do you think?

bturner commented 4 years ago

An execution timeout would be a nice improvement. The tricky part is going to be deciding how to apply it, though. waitpid doesn't have any form of timeout. We either use WNOHANG and get zero wait at all (perhaps surrounded by some wait in Java), or we don't pass any options and it sits until it's done. I don't know whether a Thread.interrupt() in Java-land would interrupt the waitpid call, but I suspect it wouldn't.

Do you have any preferences on how such a timeout would be applied? Tight loop rechecking? Short sleep in Java to "slow down" the tight loop? (Bearing in mind that any sleep is going to basically be a straight runtime penalty for all well-behaved processes, as I wrote up above)

bturner commented 4 years ago

What do you think about a Java wait with a backoff? Maybe 2 requests with no delay, then 1ms, then 3ms, then 7ms (essentially doubled plus one), etc? That way the penalty is minimal for well-behaved processes, but we don't hang indefinitely hammering away at waitpid for poorly-behaved ones?

bturner commented 4 years ago

@brettwooldridge I've switched back to WNOHANG for checking synchronous termination, with a Java loop that applies a double-plus-one backoff between checks. RunTest appears to complete in approximately the same amount of time with this looping approach as it did with the blocking wait.

I added some System.out.println calls (removed before I committed) to verify that the loop does sometimes start sleeping. My testing never showed it going past Thread.sleep(1L) before termination was detected, which seems good. I saw three different cases:

If this approach doesn't work for you, or if you've got some improvements in mind, just let me know what you'd like to see.

bturner commented 4 years ago

Thanks for the merge, and the 2.0.0 release, @brettwooldridge.

We've just shipped Bitbucket Server 7.0.0 which now uses NuProcess by default for (almost) all of its process management. We shipped with a fork of NuProcess due to timing, but I've already opened a pull request to switch over to the official 2.0.0 release for 7.1.

There are a few things like @since tags that are showing 1.3.0, in this change. Would you like me to raise a separate pull request to correct those things?