libuv / libuv

Cross-platform asynchronous I/O
https://libuv.org/
MIT License
24.01k stars 3.59k forks source link

uv_spawn is slow when parent process allocates mmap with MAP_JIT flag #3050

Closed deepak1556 closed 2 years ago

deepak1556 commented 3 years ago

Repro Steps : https://github.com/deepak1556/libuv-spawn

The issue is derived from https://github.com/electron/electron/issues/26143

Applications running on hardened runtime based on chromium/electron create mmap regions using MAP_JIT flag. With Big Sur the fork() calls done by uv_spawn have become slow. Theory is that before big sur fork() didn't have access to the JIT memory regions but they do now.

// Using the repro linked above

$ sudo dtruss -e ./out/Default/uv_spawn --use-jit

45600 fork()         = 7815 0
42260 fork()         = 7816 0

$ sudo dtruss -e ./out/Default/uv_spawn

558 fork()       = 7801 0
391 fork()       = 7800 0

Chromium's own child process spawning code doesn't suffer from this new behaviour because it uses posix_spawn.

Is it in the interest of libuv project to make the change to use posix_spawn atleast for macOS >= 11.0 ?

bnoordhuis commented 3 years ago

Thanks for the repro. I don't run Big Sur so I have no way to test myself but your analysis sounds right.

posix_spawn() on macOS is an actual system call (on other Unices libc often emulates it through fork() + execve()) with a shortcut to spawn the new process directly. No memory mappings are copied, ergo, it's very fast.

Switching to posix_spawn() isn't an option in general - it's too limited for libuv's use cases - but it could be an option for macOS because you can use platform-specific extensions.

At the very least you'll need:

I won't be working on that myself but you or anyone else is welcome to send a pull request.

marcello3d commented 3 years ago

The slow speed is unfortunate, but what's worse is that it locks the JavaScript thread (in Electron, freezing the app UI), before nodejs' spawn() call even returns.

I'm not familiar with what part of that is node vs libuv, but is it possible for this slow part to happen asynchronously (since the rest of the spawn api is async)?

deepak1556 commented 3 years ago

Thanks for the added context! I will work on the fix this week.

pdesantis commented 3 years ago

@deepak1556 Do you think this will be fixed this year? Is there any way we can help? Our Electron app is unusable on Big Sur because it relies heavily on child processes, so we’re trying to determine if we should build workarounds or contribute to the fix. Thanks again for the help!

deepak1556 commented 3 years ago

@pdesantis the issue is on our projects' December iteration task but can't promise a date and I haven't started on it. I also was not aware if anyone else wanted to work on it. So, feel free to contribute!

All the relevant api swaps that @bnoordhuis pointed are to be made in src/unix/process.c, specifically under fn uv__process_child_init and uv_spawn

pdesantis commented 3 years ago

@deepak1556 thanks! Taking a crack at it today, although don't get your hopes up too high since we're new to this codebase 😅. Will report back by the end of the day.

pdesantis commented 3 years ago

@deepak1556 @bnoordhuis we've made some progress here: https://github.com/descriptinc/libuv/commit/58cf78e29ff19edc15321655a807d3a54632b688 (shoutout to @jpcanepa!)

It's incomplete but seems promising so far. Using the repro project https://github.com/deepak1556/libuv-spawn we're seeing ls launch & exit times of 0.5ms & 3ms, which is inline with the times from the unsigned executable using fork (actually, they're slightly faster!)

Any tips on porting the file descriptor (options->stdio) logic?

pdesantis commented 3 years ago

Disregard my previous question, @jpcanepa posted a PR here https://github.com/libuv/libuv/pull/3064

jpcanepa commented 3 years ago

I want to make sure that eveyrone involved here is aware that that PR (#3064) is in draft because I need help. I can continue trying to bang things into compiling correctly, but I kind of hope that someone (@deepak1556?, @bnoordhuis?) can chip in as well.

bnoordhuis commented 3 years ago

As a bit of expectation management: I have very little time at the moment.

jpcanepa commented 3 years ago

I've updated the PR so that now it builds correctly with CMake and Autotools, and there are only two open questions: a failing unit test case due to a behavior difference from the fork/execvp flow and a question about one of the posix_spawn_*_np function calls.

jpcanepa commented 3 years ago

FWIW, the PR #3064 seems to be done, just needs some attention for a final review, to get it merged.

LukeXF commented 3 years ago

FWIW, the PR #3064 seems to be done, just needs some attention for a final review, to get it merged.

Looking forward to seeing this fixed so GitKraken will run again without it being slowed down (issue)!

cjihrig commented 3 years ago

Reopening because https://github.com/libuv/libuv/pull/3064 was reverted.

memark commented 3 years ago

Any progress here?

genesiscz commented 3 years ago

Still happens

fredleger commented 3 years ago

any progress here ?

darkguy2008 commented 3 years ago

@cjihrig it seems the patch is merged as of today, so is there any progress on this?

cjihrig commented 3 years ago

@darkguy2008 it was merged, but then reverted in https://github.com/libuv/libuv/pull/3107.

larssvensby commented 3 years ago

Any progress here?

jezmck commented 3 years ago

What's the latest on this?

fredleger commented 3 years ago

I add to apply the fix for the latest build of gitkraken so i guess it's still ongoing or gitkraken has not included the fix released if it's exists.

What i don't get is that some apps suffered the same issue after Bug Sur was out (from memory i can only name VScode but they was more than 2 apps) and they managed to get a fix out quite quickly. Is there something special about this lib or it's just on Gitkraken side ?

herberthobregon commented 2 years ago

Will this ever be solved? Any alternative to this program?

tangix commented 2 years ago

With Big Sur possibly now in maintenance mode - is Kraken working better in Monterey? Removing code-signing isn't a way forward in my book.

idunno101 commented 2 years ago

If anyone wants a quick overview on this whole situation; this issue summarises it nicely.

Right now, the fix is at #3257 but Apple needs to diagnose a kernel bug before that can be merged.

lingster commented 2 years ago

I've just tried using GitKraken's new cli-terminal using Monterey and GitKraken grinds to a halt as the renderer consumes vast amounts of cpu time. Running the workaround: codesign --remove-signature /Applications/GitKraken.app/Contents/Frameworks/GitKraken\ Helper\ \(Renderer\).app returns the following error: /Applications/GitKraken.app/Contents/Frameworks/GitKraken Helper (Renderer).app: internal error in Code Signing subsystem

hoping that there will be a fix for this soon.

luclu commented 2 years ago

@lingster try sudo ...

r0kk commented 2 years ago

I am using Big Sur and it everything worked perfectly well till previous week, and now it stopped. codesign ... doesn't seems to resolve the issue.

Is there any solution to this? Did this somehow reoccur? I like gitkraken, but will be force to stop paying yearly subscription, since it is not useable for me at the moment.

Thank you for any help.