nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
104.81k stars 28.33k forks source link

workers: initial implementation #2133

Closed petkaantonov closed 7 years ago

petkaantonov commented 8 years ago

https://github.com/nodejs/io.js/pull/1159 retargeted to master

kzc commented 8 years ago

this is good to go.

The spurious Mac test assert failure and EBADF exceptions are not a concern? If Mac is a tier 2 node platform, I get it.

petkaantonov commented 8 years ago

Yeah I forgot about that but it's an assertion error inside libuv's async emulation code so it is probably a libuv bug and should be fixed there. Is it reported to libuv yet?

Morgul commented 8 years ago

I'm not speaking for anyone but myself here, but I'd rather see this go in with the spurious Mac assert (since landing this increases the potential user base of people who will help debug it, as well as just getting it out there to see what the community does with this new feature) than to hold it back at this point.

kzc commented 8 years ago

Reduced test cases for Mac failure(s) is problematic given their spurious nature and requirement of this web worker branch.

kzc commented 8 years ago

Is libuv thread safe and free of race conditions with respect to file descriptors in the presence of threads?

bnoordhuis commented 8 years ago

Is libuv thread safe and free of race conditions with respect to file descriptors in the presence of threads?

@kzc I'm not quite sure how to parse the "with respect to file descriptors" part. Libuv is fork-safe on platforms that let you atomically set the close-on-exec flag on file descriptors. It's thread-safe in the sense that you can call most libuv functions from different threads but not for the same event loop (with the sole exception of uv_async_send().)

File operations that run inside the thread pool are not innately ordered. Two write requests for the same file descriptor can run in either order or even concurrently.

ericmdantas commented 8 years ago

Awesome!

phpnode commented 8 years ago

Extremely keen to see this merged now, it's been a long time coming and since it's exposed behind a flag it certainly seems ready. What's holding up the merge at this point?

kzc commented 8 years ago

@petkaantonov How is fork() handled or its use constrained in the web worker threads implementation? (main thread or other).

http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them

Any of the issues in this article applicable to this PR?

heavyk commented 8 years ago

a node fork isn't a real fork(). all "fork" means, is to spawn the same script as a child process, maintaining process.args, execArgs, PATH, etc. it just spawns another node instance with all of that and an ipc channel open for communication. you can emulate the behaviour using cp.spawn actually.

what happens with workers is just what you would expect: node will spawn() itself (same script and everything) with an ipc channel opened to the child. from the "forked" process' point of view, process.parent will be a reference to the parent process. no memory is copied or anything (unlike a real fork). it behaves just like a spawn. either parent or child processes may spawn workers.

https://iojs.org/api/child_process.html#child_process_child_process_fork_modulepath_args_options

This is a special case of the spawn() functionality for spawning io.js processes. In addition to having all the methods in a normal ChildProcess instance, the returned object has a communication channel built-in. See child.send(message, [sendHandle]) for details.

These child io.js processes are still whole new instances of V8. Assume at least 30ms startup and 10mb memory for each new io.js. That is, you cannot create many thousands of them.

piranna commented 8 years ago

Note: Unlike the fork() POSIX system call, child_process.fork() does not clone the current process.

I think doing this can be useful, since the .text section and other things could be shared and some memory earned since they would not be duplicated in memory like how it happens when spawning full new processes... By not preserving the heap and the stack (and maybe file descriptors), real forks could safely done and memory would not be wasted, that's one of the biggest problems of NodeOS...

(Disclaimer: I'm lead developer of NodeOS).

kzc commented 8 years ago

@heavyk - Thanks for the explanation! That makes perfect sense now.

kzc commented 8 years ago

@piranna Perhaps some pre-fork()ing technique before v8 is initialized or any threads are created could work.

heavyk commented 8 years ago

@piranna - don't wanna thread hijack so I made an issue on nodeOS repo to discuss this. https://github.com/NodeOS/NodeOS/issues/170

this issue is already really long as it is. cheers

piranna commented 8 years ago

@piranna - don't wanna thread hijack so I made an issue on nodeOS repo to discuss this. NodeOS/NodeOS#170

:+1: good idea :-)

ronkorving commented 8 years ago

Afaik the OS should already limit the copies of text to 1. It's a simple optimisation that I think kernels have been doing for years. Someone please correct me if I'm wrong.

rektide commented 8 years ago

Original PR is all check marks now. Any chance this can get in before 4.2.0?

rvagg commented 8 years ago

unlikely in the extreme that this'll make it in to v4, sorry, too big and needs time to bake before we could have confidence to support it in lts

dead-claudia commented 8 years ago

This branch has conflicts that must be resolved

I'm guessing that's part of it?

phpnode commented 8 years ago

@impinball well that's a consequence of it hanging around for so long. There's no point anyone doing the work to bring the branch up to date if it's not going to be merged any time soon, because the same thing will just happen again.

dead-claudia commented 8 years ago

There's that as well.

On Mon, Oct 5, 2015 at 6:08 AM, Charles Pick notifications@github.com wrote:

@impinball https://github.com/impinball well that's a consequence of it hanging around for so long. There's no point anyone doing the work to bring the branch up to date if it's not going to be merged any time soon, because the same thing will just happen again.

— Reply to this email directly or view it on GitHub https://github.com/nodejs/node/pull/2133#issuecomment-145486276.

Isiah Meadows

Morgul commented 8 years ago

@rvagg Does LTS have to support experimental features, behind flags, with no documentation? That seems a little much to expect, imho.

While I understand the 'needs time to bake', there's not a lot of baking that will go on without it being available somewhere in a release. I'd love to experiment with this, but not quite enough to take the time to mess with building a special version of node for my experiments.

Disclaimer: I really want to see this feature adopted in node, so I might be a little over zealous to see it merged in. XD

kzc commented 8 years ago

@rvagg Could you please clarify whether official Node releases from the upcoming 5.x branch will be considered to be Stable or Experimental?

bnoordhuis commented 8 years ago

@kzc v5 is stable. We can land experimental features behind flags but the issue with this PR is that it's large (which makes thorough reviews difficult) and touches just about everything (which means the risk of subtle bugs is not insignificant.)

In hindsight, it would have been better to approach this through a series of smaller PRs.

kzc commented 8 years ago

@bnoordhuis What criteria will be used to judge whether this PR is ready for primetime? Reviews can only go so far to catch errors. It won't be seriously tested unless officially released.

Some open source projects have an even/odd minor version number release strategy for stable/development releases. I don't suppose that's in the cards for Node?

bnoordhuis commented 8 years ago

What criteria will be used to judge whether this PR is ready for primetime?

When enough reviewers feel confident it's ready.

Some open source projects have an even/odd minor version number release strategy for stable/development releases. I don't suppose that's in the cards for Node?

You could say that, yes.

Fishrock123 commented 8 years ago

Fwiw, the basic status of this is that because of

(...) but the issue with this PR is that it's large (which makes thorough reviews difficult) and touches just about everything (which means the risk of subtle bugs is not insignificant.)

In hindsight, it would have been better to approach this through a series of smaller PRs.

No-one is super interested in taking responsibility for this, which should be quite understandable.

Also, as I understand from @trevnorris, this means we will have to be more careful when dealing with isolates in future changes, which adds a fairly large long-term debt to this.

targos commented 8 years ago

Maybe we should consider reusing the code that the V8 team did for d8 Workers ? They implemented interesting things like Object serialization (instead of JSON.stringify) and support of Transferable objects.

bnoordhuis commented 8 years ago

The serializer/deserializer in d8 is not hugely sophisticated (which is to be expected, d8 is not production-quality software.) For example, it happily overruns the stack when the object graph is deep.

I've been investigating reusing the serializer/deserializer from Blink (see also https://github.com/nodejs/node/issues/3145#issuecomment-144726777) but it pulls in a lot of additional code. The 200-something files in https://chromium.googlesource.com/chromium/src/+/dba5fc9/third_party/WebKit/Source/bindings/core/v8 are just the tip of the iceberg.

mlogan commented 8 years ago

I have a working fork of this that does serialization/deserialization + transferables using a modified version of the BSON code from https://github.com/audreyt/node-webworker-threads. I also had to fix a few data races in this code that I pointed out earlier.

It's not remotely suitable to be merged into node, and I don't have the time to invest into getting it ready, but I'd be happy to share it if someone wanted to use it as a starting point.

heavyk commented 8 years ago

put it up bro. I'll be glad to use it.

kzc commented 8 years ago

The complexity of these other serialization schemes seem counter to the goal of getting web worker threads released in a timely manner. JSON provides a simple API and a clean separation of concerns between threads. When this PR is folded into a release, it can be optimized then if the need arises.

nathansobo commented 8 years ago

The complexity of these other serialization schemes seem counter to the goal of getting web worker threads released in a timely manner. JSON provides a simple API and a clean separation of concerns between threads. When this PR is folded into a release, it can be optimized then if the need arises.

:+1: Cut scope.

Morgul commented 8 years ago

:+1: Cut scope.

I agree with this completely.

bnoordhuis commented 8 years ago

In case it wasn't clear from context, faster serialization is relevant beyond just this PR. Let's move that discussion to #3145.

spion commented 8 years ago

Regarding the size, @bnoordhuis @Fishrock123 does this mean this particular pull request wont get merged unless its split up into separate smaller ones?

Note that I don't really understand how is node supposed to get any significant new features if large PRs wont get merged. Is it even possible to split this one up?

phpnode commented 8 years ago

And if that's the case, please could you suggest some logical boundaries for those PRs so that other people could pick this up, given that Petka is busy with other things at the moment?

bnoordhuis commented 8 years ago

does this mean this particular pull request wont get merged unless its split up into separate smaller ones?

I wouldn't put it that strongly but it's unlikely to get enough review the way it is now. A good first step would be breaking out the environment cleanup code into its own PR (i.e. the RegisterHandleCleanup() and VisitHandlesWithClassIds() logic.)

listepo commented 8 years ago

+1 :+1:

tflanagan commented 8 years ago

I'd be more than willing to do the grunt work.

Fishrock123 commented 8 years ago

@tflanagan I think we'd probably accept a PR with the stun @bnoordhuis mentioned if you're up for it. :)

tflanagan commented 8 years ago

@Fishrock123 I am more than up for it, this feature definitely needs to keep moving. I'll get started either tonight or tomorrow. :)

19h commented 8 years ago

What is the current progress and state of this PR?

zxc122333 commented 8 years ago

What is the current progress and state of this PR?

trevnorris commented 8 years ago

@zxc122333 Someone with the means and knowledge will need to break up this PR into discrete parts. This PR is simply too large for proper review.

MylesBorins commented 8 years ago

I'm interested in trying to pick this up if possible. I spent a bit of time last night trying to get things to rebase against master and compile, there was a bit of weirdness though (looks like some changes in v8)

If someone was willing to pair with me a bit on irc I'm sure we could work through the errors.

@trevnorris would you have time to collaborate on this a bit next week perhaps?

tflanagan commented 8 years ago

I'm half way through the rebase myself. Haven't had much time to reconcile since I first attempted to.

trevnorris commented 8 years ago

@TheAlphaNerd i'm willing to give feedback/advice on the direction changes should go, but not sure how much time i'll have to help code the change.

jasnell commented 8 years ago

@TheAlphaNerd +1

MylesBorins commented 8 years ago

Been going at this for a bit today.

Worked out the conflicts to the best of my ability, but am dealing with some compilation errors.

../src/env.h:589:20: error: use of undeclared identifier 'HandleWrap'
  typedef ListHead<HandleWrap, &HandleWrap::handle_wrap_queue_> HandleWrapQueue;
                   ^
../src/handle_wrap.cc:71:27: error: member reference base type 'HandleWrapQueue' (aka 'int') is not a structure or union
  env->handle_wrap_queue()->PushBack(this);

edit:

Branch --> https://github.com/TheAlphaNerd/node/tree/worker Patch --> https://github.com/TheAlphaNerd/node/commit/42d1385ecc49c26f73373304d693b1b177149067.patch