Closed petkaantonov closed 7 years ago
this is good to go.
The spurious Mac test assert failure and EBADF exceptions are not a concern? If Mac is a tier 2 node platform, I get it.
Yeah I forgot about that but it's an assertion error inside libuv's async emulation code so it is probably a libuv bug and should be fixed there. Is it reported to libuv yet?
I'm not speaking for anyone but myself here, but I'd rather see this go in with the spurious Mac assert (since landing this increases the potential user base of people who will help debug it, as well as just getting it out there to see what the community does with this new feature) than to hold it back at this point.
Reduced test cases for Mac failure(s) is problematic given their spurious nature and requirement of this web worker branch.
Is libuv thread safe and free of race conditions with respect to file descriptors in the presence of threads?
Is libuv thread safe and free of race conditions with respect to file descriptors in the presence of threads?
@kzc I'm not quite sure how to parse the "with respect to file descriptors" part. Libuv is fork-safe on platforms that let you atomically set the close-on-exec flag on file descriptors. It's thread-safe in the sense that you can call most libuv functions from different threads but not for the same event loop (with the sole exception of uv_async_send()
.)
File operations that run inside the thread pool are not innately ordered. Two write requests for the same file descriptor can run in either order or even concurrently.
Awesome!
Extremely keen to see this merged now, it's been a long time coming and since it's exposed behind a flag it certainly seems ready. What's holding up the merge at this point?
@petkaantonov How is fork() handled or its use constrained in the web worker threads implementation? (main thread or other).
http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
Any of the issues in this article applicable to this PR?
a node fork isn't a real fork()
. all "fork" means, is to spawn the same script as a child process, maintaining process.args
, execArgs
, PATH
, etc. it just spawns another node instance with all of that and an ipc channel open for communication. you can emulate the behaviour using cp.spawn
actually.
what happens with workers is just what you would expect: node will spawn()
itself (same script and everything) with an ipc channel opened to the child. from the "forked" process' point of view, process.parent
will be a reference to the parent process. no memory is copied or anything (unlike a real fork). it behaves just like a spawn. either parent or child processes may spawn workers.
https://iojs.org/api/child_process.html#child_process_child_process_fork_modulepath_args_options
This is a special case of the spawn() functionality for spawning io.js processes. In addition to having all the methods in a normal ChildProcess instance, the returned object has a communication channel built-in. See child.send(message, [sendHandle]) for details.
These child io.js processes are still whole new instances of V8. Assume at least 30ms startup and 10mb memory for each new io.js. That is, you cannot create many thousands of them.
Note: Unlike the fork() POSIX system call, child_process.fork() does not clone the current process.
I think doing this can be useful, since the .text
section and other things could be shared and some memory earned since they would not be duplicated in memory like how it happens when spawning full new processes... By not preserving the heap and the stack (and maybe file descriptors), real forks could safely done and memory would not be wasted, that's one of the biggest problems of NodeOS...
(Disclaimer: I'm lead developer of NodeOS).
@heavyk - Thanks for the explanation! That makes perfect sense now.
@piranna Perhaps some pre-fork()ing technique before v8 is initialized or any threads are created could work.
@piranna - don't wanna thread hijack so I made an issue on nodeOS repo to discuss this. https://github.com/NodeOS/NodeOS/issues/170
this issue is already really long as it is. cheers
@piranna - don't wanna thread hijack so I made an issue on nodeOS repo to discuss this. NodeOS/NodeOS#170
:+1: good idea :-)
Afaik the OS should already limit the copies of text to 1. It's a simple optimisation that I think kernels have been doing for years. Someone please correct me if I'm wrong.
Original PR is all check marks now. Any chance this can get in before 4.2.0?
unlikely in the extreme that this'll make it in to v4, sorry, too big and needs time to bake before we could have confidence to support it in lts
This branch has conflicts that must be resolved
I'm guessing that's part of it?
@impinball well that's a consequence of it hanging around for so long. There's no point anyone doing the work to bring the branch up to date if it's not going to be merged any time soon, because the same thing will just happen again.
There's that as well.
On Mon, Oct 5, 2015 at 6:08 AM, Charles Pick notifications@github.com wrote:
@impinball https://github.com/impinball well that's a consequence of it hanging around for so long. There's no point anyone doing the work to bring the branch up to date if it's not going to be merged any time soon, because the same thing will just happen again.
— Reply to this email directly or view it on GitHub https://github.com/nodejs/node/pull/2133#issuecomment-145486276.
Isiah Meadows
@rvagg Does LTS have to support experimental features, behind flags, with no documentation? That seems a little much to expect, imho.
While I understand the 'needs time to bake', there's not a lot of baking that will go on without it being available somewhere in a release. I'd love to experiment with this, but not quite enough to take the time to mess with building a special version of node for my experiments.
Disclaimer: I really want to see this feature adopted in node, so I might be a little over zealous to see it merged in. XD
@rvagg Could you please clarify whether official Node releases from the upcoming 5.x branch will be considered to be Stable or Experimental?
@kzc v5 is stable. We can land experimental features behind flags but the issue with this PR is that it's large (which makes thorough reviews difficult) and touches just about everything (which means the risk of subtle bugs is not insignificant.)
In hindsight, it would have been better to approach this through a series of smaller PRs.
@bnoordhuis What criteria will be used to judge whether this PR is ready for primetime? Reviews can only go so far to catch errors. It won't be seriously tested unless officially released.
Some open source projects have an even/odd minor version number release strategy for stable/development releases. I don't suppose that's in the cards for Node?
What criteria will be used to judge whether this PR is ready for primetime?
When enough reviewers feel confident it's ready.
Some open source projects have an even/odd minor version number release strategy for stable/development releases. I don't suppose that's in the cards for Node?
You could say that, yes.
Fwiw, the basic status of this is that because of
(...) but the issue with this PR is that it's large (which makes thorough reviews difficult) and touches just about everything (which means the risk of subtle bugs is not insignificant.)
In hindsight, it would have been better to approach this through a series of smaller PRs.
No-one is super interested in taking responsibility for this, which should be quite understandable.
Also, as I understand from @trevnorris, this means we will have to be more careful when dealing with isolates in future changes, which adds a fairly large long-term debt to this.
Maybe we should consider reusing the code that the V8 team did for d8 Workers ? They implemented interesting things like Object serialization (instead of JSON.stringify) and support of Transferable objects.
The serializer/deserializer in d8 is not hugely sophisticated (which is to be expected, d8 is not production-quality software.) For example, it happily overruns the stack when the object graph is deep.
I've been investigating reusing the serializer/deserializer from Blink (see also https://github.com/nodejs/node/issues/3145#issuecomment-144726777) but it pulls in a lot of additional code. The 200-something files in https://chromium.googlesource.com/chromium/src/+/dba5fc9/third_party/WebKit/Source/bindings/core/v8 are just the tip of the iceberg.
I have a working fork of this that does serialization/deserialization + transferables using a modified version of the BSON code from https://github.com/audreyt/node-webworker-threads. I also had to fix a few data races in this code that I pointed out earlier.
It's not remotely suitable to be merged into node, and I don't have the time to invest into getting it ready, but I'd be happy to share it if someone wanted to use it as a starting point.
put it up bro. I'll be glad to use it.
The complexity of these other serialization schemes seem counter to the goal of getting web worker threads released in a timely manner. JSON provides a simple API and a clean separation of concerns between threads. When this PR is folded into a release, it can be optimized then if the need arises.
The complexity of these other serialization schemes seem counter to the goal of getting web worker threads released in a timely manner. JSON provides a simple API and a clean separation of concerns between threads. When this PR is folded into a release, it can be optimized then if the need arises.
:+1: Cut scope.
:+1: Cut scope.
I agree with this completely.
In case it wasn't clear from context, faster serialization is relevant beyond just this PR. Let's move that discussion to #3145.
Regarding the size, @bnoordhuis @Fishrock123 does this mean this particular pull request wont get merged unless its split up into separate smaller ones?
Note that I don't really understand how is node supposed to get any significant new features if large PRs wont get merged. Is it even possible to split this one up?
And if that's the case, please could you suggest some logical boundaries for those PRs so that other people could pick this up, given that Petka is busy with other things at the moment?
does this mean this particular pull request wont get merged unless its split up into separate smaller ones?
I wouldn't put it that strongly but it's unlikely to get enough review the way it is now. A good first step would be breaking out the environment cleanup code into its own PR (i.e. the RegisterHandleCleanup() and VisitHandlesWithClassIds() logic.)
+1 :+1:
I'd be more than willing to do the grunt work.
@tflanagan I think we'd probably accept a PR with the stun @bnoordhuis mentioned if you're up for it. :)
@Fishrock123 I am more than up for it, this feature definitely needs to keep moving. I'll get started either tonight or tomorrow. :)
What is the current progress and state of this PR?
What is the current progress and state of this PR?
@zxc122333 Someone with the means and knowledge will need to break up this PR into discrete parts. This PR is simply too large for proper review.
I'm interested in trying to pick this up if possible. I spent a bit of time last night trying to get things to rebase against master and compile, there was a bit of weirdness though (looks like some changes in v8)
If someone was willing to pair with me a bit on irc I'm sure we could work through the errors.
@trevnorris would you have time to collaborate on this a bit next week perhaps?
I'm half way through the rebase myself. Haven't had much time to reconcile since I first attempted to.
@TheAlphaNerd i'm willing to give feedback/advice on the direction changes should go, but not sure how much time i'll have to help code the change.
@TheAlphaNerd +1
Been going at this for a bit today.
Worked out the conflicts to the best of my ability, but am dealing with some compilation errors.
../src/env.h:589:20: error: use of undeclared identifier 'HandleWrap'
typedef ListHead<HandleWrap, &HandleWrap::handle_wrap_queue_> HandleWrapQueue;
^
../src/handle_wrap.cc:71:27: error: member reference base type 'HandleWrapQueue' (aka 'int') is not a structure or union
env->handle_wrap_queue()->PushBack(this);
edit:
Branch --> https://github.com/TheAlphaNerd/node/tree/worker Patch --> https://github.com/TheAlphaNerd/node/commit/42d1385ecc49c26f73373304d693b1b177149067.patch
https://github.com/nodejs/io.js/pull/1159 retargeted to master