Closed sagebind closed 8 years ago
I'll copy my general thoughts here as well from earlier:
execute()
closure-style isn't particularly bad, but the design seems optimized for the use-case of multiple event loops in one program, which I thought isn't something we were trying to encourage (not that it should necessarily be forbidden).execute()
is called, but I don't think it helps much anywhere else in a program.execute()
approach allows. Forking doesn't treat previous scoping nicely in that regard.The execute() closure-style isn't particularly bad, but the design seems optimized for the use-case of multiple event loops in one program, which I thought isn't something we were trying to encourage (not that it should necessarily be forbidden).
Why does it seem optimized for multiple event loops for you? The usual way to do things in Amp is Amp\run(function() { /* ... */ });
.
Regarding forking, I think you would just start a new loop inside or better fork before you start the loop.
Why does it seem optimized for multiple event loops for you? The usual way to do things in Amp is
Amp\run(function() { /* ... */ });
.
I don't have a technical answer to that. :grinning: Merely that, in typical usage, you'll have "setup" code outside of the closure before the run()
, and everything else inside the closure. It makes the code look disproportionate.
Regarding forking, I think you would just start a new loop inside or better fork before you start the loop.
For forking that may work... I assume you mean something like this:
function runInFork(callable $work) {
pcntl_fork();
Loop::execute($work, ...some loop...);
exit();
}
Loop::execute(function() {
// some stuff
runInFork(function() {});
}, $loop);
The issue is that the parent loop is left in memory in the forked process, and then is effectively killed while running at the exit()
Not exactly a clean escape. The situation in threads is actually worse, as persisting the previous event loop instance beyond the entry-point method could cause pthreads to try and serialize it.
Requiring users to create all forks and threads before any asynchronous code seems like an artificial requirement that limits their usefulness.
I guess the main problem is that execute()
guarantees a balanced stack of execution, but forks and threads need to discard the parent's stack history.
I have to say: I didn't use forking in combination with event loops yet.
I have no idea about the forking either; never did that with an event loop. You theoretically could Loop::stop()
and start a separate loop in the child process then.
Actually, we have this problem, when we want to support something like wait()
.
In that case you aren't inside the run() function [callback], but need to set events on the loop.
I though disagree that Loop::set() is the solution to that. It feels wrong to force the user to manually set the loop in a sync application… Don't know :-/
In Amp we go around that issue by creating a default loop if none is set yet...
Perhaps to avoid Loop::set()
we could only provide Loop::drop()
in addition to Loop::execute()
? Drop is less likely to be mis-used by accident, though we still have the issue of pthreads not playing all that nicely with closures.
I think right now I lean toward providing all of the methods above, since most of the time Loop::execute()
provides the best interface for users, but Loop::set()
and Loop::drop()
are still available for those situations where you need them.
@bwoebi Oh, I think I just understood what you meant :stuck_out_tongue:
Actually, we have this problem, when we want to support something like
wait()
. In that case you aren't inside the run() function [callback], but need to set events on the loop.
You mean like this:
// I'm so synchronous...
$request = new SomeAsyncLibThing();
// Block all the things...
$response = EventLoop\wait($request->send()); // <-- we need to run someone's loop here, but then discard it when we have the result...
Yes, that's the issue... You want to ideally have the user unaware of the event-loop under the hood…
@coderstephen The only solution I can think of would be the event loops implementations setting a default loop in an always loaded file (i.e. files entry with composer loader).
We'd need set() then, but we'd disallow it's usage inside Loop::execute().
It also possibly should be possible to allow (top-level) Loop::execute() to run without $loop parameter then (you just use the auto-set one).
That'd have the advantage of abstracting the actual loop away completely...
@bwoebi: we could parse composer.json
for the "extra"
field, but that's not performant. So maybe Loop::setFactory(callable)
and the event loop implementation will have a files file settings that.
@kelunik yeah, exactly that.
I added a proposal as PR: #30
@bwoebi If set()
is provided, I agree it should not be allowed while running in an execute()
. My PR already handles that case, I think.
Regardless of how the driver instance is actually obtained, I think we still need the extra degree of control with set()
and drop()
for those certain cases, e.g. multiprocessing.
A factory is really handy in this case though as well. The example runInFork()
earlier could be improved to be:
function runInFork(callable $work) {
pcntl_fork();
Loop::drop();
Loop::set(Loop::createDriver());
$work();
Loop::run();
exit();
}
or even better:
function runInFork(callable $work) {
pcntl_fork();
Loop::drop();
Loop::execute($work);
exit();
}
I'm not sure the latter plays nicely with pthreads 2 because of the closure, but at least it is possible without persisting the parent event loop in memory.
I think we don't need set()
, drop()
will be enough?
@coderstephen Could you clarify: Why do you need to fork inside an event loop? What's the actual use case for it?
@kelunik On further consideration, you may be right; drop()
is probably enough.
Fork in an event loop for multiprocessing. The goal isn't to fork the event loop, but to run user code in parallel, especially blocking code. See Icicle concurrent and Icicle filesystem for use cases. The event loop being duplicated into a child process is an inconvenience, not a benefit, in these cases.
I get that, but what's the benefit of forking vs. simply launching a worker process and maybe even reuse that? Because not only your loop will be duplicated, but every thing else, too.
Speed. Inheriting functions being passed, not having to re-load classes from disk in the class loader. Memory copy is faster than disk access.
I'm not sure the benefits of forking and multithreading are relevant to this issue though. The fact that someone wants to do it should be enough that the standard should be compatible with this use case.
@coderstephen I'm not sure why exactly you want drop()? You might just run another loop inside the current loop:
function runInFork(callable $work) {
if (pcntl_fork() == 0) {
Loop::onError(null); /* eventual exceptions should not be caught */
Loop::execute($work); /* it will just normally execute another loop inside the main loop */
exit(); /* the main loop is never continued as we exit here. */
}
}
Or am I missing anything here?
Sure, the old event loop will appear in the stacktrace, but that's already it.
@bwoebi And the old loop will still be there in memory with all it's watchers and so on. Like every other object that has been created, too. That's why I questioned whether forking is a good idea here.
It will be in memory, but that's unavoidable, as it's still referenced in the stacktrace. There's no actual way to destroy the loop. But @coderstephen is probably right, it's still faster to copy a bit of memory (I assume it'll be < 50 MB (time to copy, uh, less than 10 ms (depends on RAM speed)). [But yeah, you shouldn't do that in an Aerys server for example as memory may spike and when you fork() processes of 500 MB multiple times per second you have a problem.]
For most cases -d opcache.enable_cli=1
is good enough. It should be only marginally slower than direct forking.
Yeah, you definitely don't want to be forking for every request that comes in when writing a web server, but managing a fork pool comes in handy. :grinning:
It will be in memory, but that's unavoidable, as it's still referenced in the stacktrace. There's no actual way to destroy the loop.
@bwoebi I'm not trying to stir the pot or anything, but the only reason the loop is referenced in the stacktrace is because of Loop::execute()
. A plain get()
/set()
system does not have this issue, though admittedly just that approach has other inconveniences.
@coderstephen we'll still have to start the loop in some way … whether it's now via a get/set/run or execute function. And this will then be in the stacktrace. No way around that.
Also, proc_open() is really not that bad, especially if you have opcache for fast startups.
@coderstephen Can we close this now, as it can't work?
Closing as it doesn't work as the loop will always be present in the backtrace.
This is a new issue to aggregate some discussion that happened in #14 and #17 concerning how the global event loop instance is controlled.
Currently we are offering
Loop::execute()
to set the loop driver andLoop::get()
to fetch the instance. Is this the best approach, or should we also provide aget()
/set()
-style control as well? We discussed this a bit previously, but I wasn't sure if we came to a conclusive stance.Here's a summary of the arguments so far that I recall:
execute()
andget()
: Eliminates the possibility of people messing with loop instances in ways they shouldn't, provides clear scoping at the entry of the program.get()
,set()
, anddrop()
: Greater flexibility, nesting in a closure is not mandatory.I closed my old PR as the purpose had already been solved: adding a static class for storing the global driver instance. I opened a new PR (#27) with a reference implementation for discussion.