davidlattimore / wild

A very fast linker for Linux
Apache License 2.0
714 stars 18 forks source link

Avoid or reduce shutdown costs #26

Closed davidlattimore closed 1 week ago

davidlattimore commented 4 months ago
          > Depending on how fast hashing is, hashing the output might still be OK if it can be done in parallel with closing all the input files (which is surprisingly slow - run the linker with --time to see).

Unrelated to this issue, but have you considered any of:

(Sorry to hijack the issue, but those ideas just came up as solutions to that subproblem.)

Originally posted by @VorpalBlade in https://github.com/davidlattimore/wild/issues/15#issuecomment-2284853169

Let's move discussion to a new issue, since I'd like to keep that other issue as a good first issue for people to work on.

I've tried a couple of those things. In particular I've tried unmapping files from separate threads and also just leaving them and letting the OS clean them up on exit. Neither seemed to make any difference.

In the case of using multiple threads, I think the issue is that munmap acquires a per-process kernel lock, so the threads end up waiting for each other anyway.

In the case of just leaving things for shutdown, I'm less sure, but I think process termination causes the kernel to do the same unmapping. So termination ends up taking longer. This is somewhat surprising and also feels like the kind of thing that could change with future changes in the Linux kernel, so should probably be revisited at some point.

I haven't tried using io-uring to munmap files. Is that possible? Given the above results, I'd be surprised if this were a faster way to close files.

I have considered using io-uring for all file IO. It'd be a pretty big change from how things currently work though and I suspect might make performance worse rather than better, although that's just a hunch. My reason for suspecting that, is that AFAIK, io_uring will copy the data, even if it's in cache, whereas mmap can reuse the pages of the cache and only needs to update the page tables of the process.

The forking trick would likely work. I haven't done it as yet, but there's a moderate chance that I will eventually.

VorpalBlade commented 4 months ago

You are almost certainly right about io-uring vs mmap, since you said close not unmap in the other issue I misunderstood what you were doing.

Then it sounds like the fork and letting a background process do the actual work is the best way to go. (That actually sounds like a trick I myself might have use for in a program of mine. It should be more widely publicised as a trick.)

andrewdavidmackenzie commented 2 weeks ago

Focusing on using the fork of a child method to do the linking work, and having the parent exit as soon as the output file is written:

I imagine it would be desirable to have a --fork/--nofork command line option to be able to enable/disable this behaviour, combined with a default of "nofork" until proven reliable and no indesireable side-effects, then switch to "fork" be the default after that?

Parent should watch for death of a child signal, so that any unexpected exit (panic) by the child, or uncontrolled exit path, would not leave the parent hanging.

Any other thoughts on that particular feature?

andrewdavidmackenzie commented 2 weeks ago

Considering options on how to "fork" the sub-process

After looking at them all, I sent a WIP PR using fork(): https://github.com/davidlattimore/wild/pull/192

davidlattimore commented 2 weeks ago

Thanks for working on this @andrewdavidmackenzie. Regarding the Command option, you wouldn't need a separate binary, you just need to make sure that you set some flag or environment variable so that the subprocess knows that its the subprocess and doesn't fork again (fork bombs are bad).

andrewdavidmackenzie commented 2 weeks ago

That's true, it could "exec" itself...

VorpalBlade commented 2 weeks ago

Considering options on how to "fork" the sub-process [...]

Nix is also unsafe (for the same reason as libc: fork when multi-threaded is complex). The ni fork appears to be a thin wrapper around libc.

It looks to me like the fork crate should be unsafe but isn't. Fishy. It too is a thin wrapper around libc.

Either way, I don't think the unsafe is a big issue for wild in this case. The fork will presumably happen as early as possible, at which point the parent process will still be single threaded. After the point of fork, either process can do whatever it wants with threads.

I unfortunately have no clear idea into what might be going on with stdio. It should work from a child, as long as stdio isn't locked at the moment of forking. Does wild lock stdio once early and run with it? That might cause issues, it would be better to do that after the fork.

andrewdavidmackenzie commented 2 weeks ago

I imagine Fork crate just has an unsafe {} inside the function, then public function doesn't need to be unsafe.

davidlattimore commented 2 weeks ago

I imagine Fork crate just has an unsafe {} inside the function, then public function doesn't need to be unsafe.

Yeah, but I think what @VorpalBlade is questioning is whether it's legitimate for them to mark the function safe. I can see an argument for it being safe. Forking when threads have been started can lead to deadlock. I'm not aware of a way for it to lead to arbitrary undefined behaviour. But maybe there is. If nothing else, it'd be good to have a comment justifying why it isn't marked unsafe.

andrewdavidmackenzie commented 2 weeks ago

I've discussed this before with others.

My thought was that if a low-level bit of code is unsafe (e.g It calls some unsafe function in libc), and the "rule" is to mark the function containing it unsafe....then what about the functions calling that unsafe function?

If you apply the rule consistently, unsafe will percolate up and you will end up with main() being unsafe.

My approach was to make the unsafe scope as small as possible...

But, I think it's debatable....

Maybe the rule could be to percolate up to a crate API?

davidlattimore commented 2 weeks ago

It's not that anything that uses unsafe should itself be marked as unsafe. It's more that your function should be unsafe unless you can be sure that you're upholding all of the safety requirements no matter how users of your function call it.

VorpalBlade commented 2 weeks ago

https://www.man7.org/linux/man-pages/man3/fork.3p.html (note the 3p, indicating this is from the POSIX standard). If you search the page for "undefined" there are several hits. Some of them seem to be about atfork handlers, but not all.

https://www.man7.org/linux/man-pages/man2/fork.2.html says

After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2).

This doesn't say that doing otherwise is undefined, but it also doesn't say what would happen if you do break that rule.

https://www.man7.org/linux/man-pages/man7/signal-safety.7.html only uses wording like "unpredictable results". (But this is a Linux man page, not a POSIX one)

Given that arbitrary signal handlers are generally considered unsafe in Rust, and that this is adjecent to that minefield I'm not comfortable with the idea of just reexposing fork as safe without explaining why it safe in this specific use. That could be "this application is currently single threaded" (which is a decision a library can never make), or "we only execute async signal safe functions after the fork" (which precludes arbitrary caller provided code from being executed).

Note that the standard library also take this stance: CommandExt::pre_exec is unsafe.

As such I believe the fork library to be unsound.

EDIT: See also https://internals.rust-lang.org/t/why-no-fork-in-std-process/13770/4 and the transitivly linked discussions.

andrewdavidmackenzie commented 2 weeks ago

That sounds good, like there are plenty of cases when fork can have problems, and a library cannot know if that is the case

Related: I see there's now a "standard" for doc comments on a function to explain it's safety for such cases (if you can indeed control it...)

On Sun, Nov 24, 2024, 9:29 AM Arvid Norlander @.***> wrote:

https://www.man7.org/linux/man-pages/man3/fork.3p.html (note the 3p, indicating this is from the POSIX standard). If you search the page for "undefined" there are several hits. Some of them seem to be about atfork handlers, but not all.

https://www.man7.org/linux/man-pages/man2/fork.2.html says

After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2).

This doesn't say that doing otherwise is undefined, but it also doesn't say what would happen if you do break that rule.

https://www.man7.org/linux/man-pages/man7/signal-safety.7.html only uses wording like "unpredictable results".

Given that arbitrary signal handlers are generally considered unsafe https://docs.rs/signal-hook/latest/signal_hook/low_level/fn.register.html in Rust, and that this is adjecent to that minefield I'm not comfortable with the idea of just reexposing fork as safe without explaining why it safe in this specific use. That could be "this application is currently single threaded" (which is a decision a library can never make), or "we only execute async signal safe functions after the fork" (which precludes arbitrary caller provided code from being executed).

Note that the standard library also take this stance: CommandExt::pre_exec https://doc.rust-lang.org/std/os/unix/process/trait.CommandExt.html#tymethod.pre_exec is unsafe.

As such I believe the fork library to be unsound.

— Reply to this email directly, view it on GitHub https://github.com/davidlattimore/wild/issues/26#issuecomment-2495870536, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKF4LBLIW2CQTQ46ZZA33L2CGE6DAVCNFSM6AAAAABMNDRXA6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIOJVHA3TANJTGY . You are receiving this because you were mentioned.Message ID: @.***>

davidlattimore commented 1 week ago

Fixed by @andrewdavidmackenzie in #192. Thanks!