NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

Musl as default instead of glibc #90147

Open domenkozar opened 4 years ago

domenkozar commented 4 years ago

Thanks to many distributions such as Alpine Linux, a lot of software compiles with musl nowadays.

Switching to musl would significantly reduce closure size and improve static linking support. See https://www.etalabs.net/compare_libcs.html

It would be interesting to see how much of Nixpkgs compiles under the pkgsMusl attribute, and to make the switch at some point.

vcunat commented 4 years ago

I don't think it should be the default, but musl is still interesting to me, e.g. for pkgsStatic. IIRC some people do submit musl-specific fixes, but I haven't seen significant interest around NixOS.org so far.

zzywysm commented 4 years ago

musl advertises that it is "lightweight, fast, simple, free, correct", but one thing they do not advertise is resistance against exploitation.

glibc has long been in an arms race against hackers, and as new techniques are found for attacking the glibc heap, glibc introduces mitigations against them. A small part of this history is found here:

http://phrack.org/issues/61/6.html
http://phrack.org/issues/66/10.html
https://github.com/shellphish/how2heap

What mitigations does musl offer to protect users from malicious input attempting to corrupt their processes' heaps? Without this information, we can't tell if switching to musl is a security regression or not.

vcunat commented 4 years ago

Oh, now I recalled that systemd tends to be problematic: https://github.com/NixOS/nixpkgs/pull/37715

Part of the reason will be that systemd does lots of low-level stuff, another part that people choosing "lightweight" libcs usually don't care much for systemd... for some mysterious reasons ;-) (the correlation might go both ways)

Of course, problems can be patched, etc. For most packages one can probably find a solution somewhere already. The key question is whether there's enough motivation to also maintain this divergence from the majority.

zzywysm commented 4 years ago

> What mitigations does musl offer to protect users from malicious input attempting to corrupt their processes' heaps? Without this information, we can't tell if switching to musl is a security regression or not.

To determine the level of robustness between the glibc and musl malloc implementations, I constructed the following (rough) metric.

I searched through each implementation's malloc.c file to see how many error conditions will lead to an immediate abort. The results:

musl-1.2.0:
    src/malloc/malloc.c:388: if (extra & 1) a_crash();
    src/malloc/malloc.c:406: if (next->psize != self->csize) a_crash();
    src/malloc/malloc.c:450: if (next->psize != self->csize) a_crash();
    src/malloc/malloc.c:515: if (extra & 1) a_crash();
(4 results)

glibc-2.31:
    malloc/malloc.c:1454: malloc_printerr ("corrupted size vs. prev_size");
    malloc/malloc.c:1460: malloc_printerr ("corrupted double-linked list");
    malloc/malloc.c:1468: malloc_printerr ("corrupted double-linked list (not small)");
    malloc/malloc.c:2537: malloc_printerr ("break adjusted to free malloc space");
    malloc/malloc.c:2830: malloc_printerr ("munmap_chunk(): invalid pointer");
    malloc/malloc.c:2858: malloc_printerr("mremap_chunk(): invalid pointer");
    malloc/malloc.c:3175: malloc_printerr ("realloc(): invalid pointer");
    malloc/malloc.c:3594: malloc_printerr ("malloc(): memory corruption (fast)");
    malloc/malloc.c:3644: malloc_printerr ("malloc(): smallbin double linked list corrupted");
    malloc/malloc.c:3736: malloc_printerr ("malloc(): invalid size (unsorted)");
    malloc/malloc.c:3739: malloc_printerr ("malloc(): invalid next size (unsorted)");
    malloc/malloc.c:3741: malloc_printerr ("malloc(): mismatching next->prev_size (unsorted)");
    malloc/malloc.c:3744: malloc_printerr ("malloc(): unsorted double linked list corrupted");
    malloc/malloc.c:3746: malloc_printerr ("malloc(): invalid next->prev_inuse (unsorted)");
    malloc/malloc.c:3786: malloc_printerr ("malloc(): corrupted unsorted chunks 3");
    malloc/malloc.c:3868: malloc_printerr ("malloc(): largebin double linked list corrupted (nextsize)");
    malloc/malloc.c:3874: malloc_printerr ("malloc(): largebin double linked list corrupted (bk)");
    malloc/malloc.c:3957: malloc_printerr ("malloc(): corrupted unsorted chunks");
    malloc/malloc.c:4061: malloc_printerr ("malloc(): corrupted unsorted chunks 2");
    malloc/malloc.c:4107: malloc_printerr ("malloc(): corrupted top size");
    malloc/malloc.c:4173: malloc_printerr ("free(): invalid pointer");
    malloc/malloc.c:4177: malloc_printerr ("free(): invalid size");
    malloc/malloc.c:4201: malloc_printerr ("free(): double free detected in tcache 2");
    malloc/malloc.c:4249: malloc_printerr ("free(): invalid next size (fast)");
    malloc/malloc.c:4266: malloc_printerr ("double free or corruption (fasttop)");
    malloc/malloc.c:4276: malloc_printerr ("double free or corruption (fasttop)");
    malloc/malloc.c:4288: malloc_printerr ("invalid fastbin entry (free)");
    malloc/malloc.c:4309: malloc_printerr ("double free or corruption (top)");
    malloc/malloc.c:4314: malloc_printerr ("double free or corruption (out)");
    malloc/malloc.c:4317: malloc_printerr ("double free or corruption (!prev)");
    malloc/malloc.c:4322: malloc_printerr ("free(): invalid next size (normal)");
    malloc/malloc.c:4332: malloc_printerr ("corrupted size vs. prev_size while consolidating");
    malloc/malloc.c:4356: malloc_printerr ("free(): corrupted unsorted chunks");
    malloc/malloc.c:4477: malloc_printerr ("malloc_consolidate(): invalid chunk size");
    malloc/malloc.c:4493: malloc_printerr ("corrupted size vs. prev_size in fastbins");
    malloc/malloc.c:4553: malloc_printerr ("realloc(): invalid old size");
    malloc/malloc.c:4564: malloc_printerr ("realloc(): invalid next size");
(37 results)

Admittedly, this is not a perfect metric, because the glibc malloc is more complicated, has a fastpath and slower paths, etc. But in general it seems like glibc is being much more careful.

I think sticking with glibc is the smarter decision from a security perspective.
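
For anyone wanting to repeat the count, here is a minimal sketch of the rough metric described above. It is not the exact command used for the listing: the helper name `count_abort_sites` and the two regular expressions are assumptions made for illustration, and the short excerpts stand in for the full malloc.c files.

```python
import re

# Assumed patterns: a_crash() calls for musl, and malloc_printerr calls
# that take a string literal for glibc (this skips the helper's own
# declaration and definition).
ABORT_CALLS = {
    "musl": re.compile(r"\ba_crash\s*\(\s*\)"),
    "glibc": re.compile(r"\bmalloc_printerr\s*\(\s*\""),
}


def count_abort_sites(source: str, libc: str) -> int:
    """Count call sites of the abort helper in the given C source text."""
    return len(ABORT_CALLS[libc].findall(source))


# Excerpts copied from the listing above, standing in for the real files.
MUSL_EXCERPT = """
if (extra & 1) a_crash();
if (next->psize != self->csize) a_crash();
"""

GLIBC_EXCERPT = """
malloc_printerr ("free(): invalid pointer");
malloc_printerr("mremap_chunk(): invalid pointer");
malloc_printerr ("free(): invalid size");
"""

if __name__ == "__main__":
    print("musl excerpt:", count_abort_sites(MUSL_EXCERPT, "musl"))
    print("glibc excerpt:", count_abort_sites(GLIBC_EXCERPT, "glibc"))
```

To apply it to a real source tree, read each malloc.c into a string and pass it to `count_abort_sites`. The result depends on the release and on what the pattern counts as a call site.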

markuskowa commented 4 years ago

@domenkozar what would be the advantages of musl over glibc?

domenkozar commented 4 years ago

@markuskowa smaller closure size, easier static linking. See https://www.etalabs.net/compare_libcs.html

samueldr commented 4 years ago

Having tried to use pkgsStatic for the Mobile NixOS stage-1, as @vcunat said, systemd won't play ball, and we still need a bunch of work to make a large proportion of Nixpkgs work. A bunch of trivial-enough things didn't work; some were fixed, some were worked around with alternatives. In the end I decided to go with glibc.

Though, with that said, I'm not opposed to the idea, but as a default, I'm not sure when and if it'll happen. It'd need many person-hours to get there.

Hello71 commented 4 years ago

> What mitigations does musl offer to protect users from malicious input attempting to corrupt their processes' heaps? Without this information, we can't tell if switching to musl is a security regression or not.
>
> To determine the level of robustness between the glibc and musl malloc implementations, I constructed the following (rough) metric.
>
> I searched through each implementation's malloc.c file to see how many error conditions will lead to an immediate abort. The results:
>
> musl-1.2.0: (4 results)
>
> glibc-2.31: (37 results)
>
> Admittedly, this is not a perfect metric, because the glibc malloc is more complicated, has a fastpath and slower paths, etc. But in general it seems like glibc is being much more careful.
>
> I think sticking with glibc is the smarter decision from a security perspective.

musl 1.2.0 malloc.c: 548 lines
glibc 2.31 malloc.c: 5610 lines

musl asserts per line: .007299
glibc asserts per line: .006595

therefore, musl is more secure.
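
The two ratios can be reproduced from the numbers quoted in this thread (4 and 37 abort sites, 548 and 5610 lines). The sketch below only checks the arithmetic; the dictionary and function names are made up for illustration.

```python
# Counts and file lengths exactly as quoted in this thread.
ABORT_SITES = {"musl 1.2.0": 4, "glibc 2.31": 37}
MALLOC_C_LINES = {"musl 1.2.0": 548, "glibc 2.31": 5610}


def checks_per_line(name: str) -> float:
    """Abort sites divided by the number of lines in malloc.c."""
    return ABORT_SITES[name] / MALLOC_C_LINES[name]


for name in ABORT_SITES:
    # Six decimal places, matching the precision used in the comment above.
    print(f"{name}: {checks_per_line(name):.6f}")
```

Rounded to six decimal places this gives 0.007299 for musl and 0.006595 for glibc, matching the figures above.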

thestinger commented 4 years ago

glibc malloc has more sanity checks. At least compared to glibc before thread caching was added (major caveat), the current generation musl malloc is not security aware and is friendlier to exploitation. Both use a similar design based on trusted inline metadata that's inherently friendly to exploitation. glibc now has thread caches which are a huge boon for easier and more reliable exploitation, especially in complex situations involving parallelism, while musl won't be taking that approach. Thread caches bypassed a substantial amount of the previous work put into bolting on sanity checks to glibc malloc. They also make exploitation much more reliable for threaded programs and fundamentally get in the way of more meaningful deterministic security checks.

musl has a new malloc implementation landing soon, with a fundamentally more security-oriented design than glibc's approach. It isn't possible to build the same kind of security through adding weak sanity checks like glibc. It's a base that can be turned into a truly hardened allocator by bolting on additional features, unlike the traditional dlmalloc-like design used by glibc and musl (glibc makes the fundamental issues worse with tcache). You can't build decent security by bolting it on top of a fragile foundation preventing robust security checks. I strongly recommend reading this thread from Rich Felker on some of the security properties of oldmalloc vs. malloc-ng, and perhaps the other discussions about it on the mailing list and Twitter:

https://www.openwall.com/lists/musl/2020/05/13/1

If you care about exploit mitigation, wait until the next generation malloc lands. There are also other security features missing that should be added.

FORTIFY_SOURCE isn't implemented in musl, although the glibc implementation is lackluster (only checks writes, not reads, etc.) and doesn't bother with Clang compatibility. Clang actually has superior extensions for implementing it... but an implementation compatible with both that's strictly better than the glibc one is straightforward. I think the stance on this is that it should be done with a header-only approach, but I don't think a high quality production implementation is available, so it's effectively not available.

musl also doesn't currently have setjmp protection or much of an attempt at function pointer protection, although the implementations in glibc are lackluster and quite incomplete. This only really matters if you're using type-based CFI elsewhere, and neither musl nor glibc has support for Clang type-based CFI. Neither supports the arm64 ShadowCallStack feature, which is the approach to protecting return addresses there (hopefully CET shows up for x86 soon for a superior hardware implementation, and MTE on arm64 for similar reasons). ShadowCallStack doesn't exist for x86 since their approach didn't meet the same standards (races, etc.).

There's a fork of musl for Fuchsia with some of these features added, but not in a way that would ever be possible to land upstream since they wrote it in C++ and did it specifically for that OS with assumptions that wouldn't hold elsewhere.

glibc has a lot of security misfeatures and far more attack surface / bugs. Exploit mitigation isn't everything, especially when it's done poorly as glibc does in many cases. It also introduces problems with features like secondary stack caching that are not present in musl. I think it's a mixed bag right now. malloc is quite important so there's a very strong argument that the current generation glibc beats musl on exploit mitigation (and the libc attack surface isn't substantial in the big picture anyway) but that's not going to be true for much longer.

kaniini commented 4 years ago

> FORTIFY_SOURCE isn't implemented in musl,

While it is true that musl itself does not ship with a FORTIFY_SOURCE implementation, Alpine (and other apk distributions) have support for FORTIFY_SOURCE thanks to the work of our toolchain maintainers, actually. We believe it to be of higher quality than the glibc one.

But Nix's userbase is probably best off staying with glibc because there are a number of differences (even with layering other libraries on top) between musl and glibc environments, and in general, the folks maintaining the musl ecosystem at large don't wish to evolve the environment into a glibc clone.

domenkozar commented 4 years ago

Thanks all for the input. Seems like we still have a long way to go.

teburd commented 3 years ago

To chime in here, I see Nix as a possible Yocto/Alpine alternative for embedded Linux and Docker image builds, where size is important. Musl being significantly smaller is a very desirable feature when space is at a premium, and I'd really love to see Hydra testing the package set against pkgsStatic and pkgsMusl for a large swath of things. It's understandable if NixOS chooses to stick with glibc for desktop/laptop-sized systems for the foreseeable future. It's featureful and well-trodden ground.

I'd be happy to contribute my time towards doing so!

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

Atemu commented 3 years ago

I get the feeling that there isn't enough momentum behind this.

It'd be really cool to have but, given the workload that maintaining an alternative libc brings with it, I don't see proper widespread musl support (much less using it by default) coming any time soon.

yu-re-ka commented 1 year ago

I have made a lot of progress on the pkgsMusl (native musl bootstrap, dynamically linked) package set. One can build a small VM with a desktop environment (sway) and some programs (firefox, thunderbird, wireshark, dino) from my flake.

nix run 'gitlab:cyberchaoscreatures/musl-nixos?host=cyberchaos.dev#nixosConfigurations.x86_64.config.system.build.vm' -- -vga virtio

Thiago-Assis-T commented 1 month ago

Is this being worked on or considered? It sure would be nice to have: not necessarily building all packages with musl, but an option that, when enabled, would build every package that can be built with musl.

xplshn commented 1 month ago

> musl advertises that it is "lightweight, fast, simple, free, correct", but one thing they do not advertise is resistance against exploitation.
>
> glibc has long been in an arms race against hackers, and as new techniques are found for attacking the glibc heap, glibc introduces mitigations against them. A small part of this history is found here:
>
> http://phrack.org/issues/61/6.html
> http://phrack.org/issues/66/10.html
> https://github.com/shellphish/how2heap
>
> What mitigations does musl offer to protect users from malicious input attempting to corrupt their processes' heaps? Without this information, we can't tell if switching to musl is a security regression or not.

Note: The XZ Utils backdoor would only work if you had glibc and systemd.

vcunat commented 1 month ago

Note: that particular backdoor never seemed to work on nixpkgs-built SW anyway.

xplshn commented 1 month ago

> Note: that particular backdoor never seemed to work on nixpkgs-built SW anyway.

Of course not, since it required a patched OpenSSH. Just a remark, though.

emilazy commented 1 month ago

The backdoor targeted Debian and Fedora running their libc and their version of OpenSSH because those are the most popular, high‐value targets. The OpenSSH patch linking libsystemd rather than hand‐coding the notification may have been sub‐optimal, but it’s a fallacy to think that the attacker would have been stymied and given up if it wasn’t there. There are many other equally‐easy routes it could have used to inject a remote code execution exploit and you can’t pin the blame on the one it used as being especially bad for security without doing actual analysis to show concrete ways in which they meaningfully increase the attack surface compared to the counterfactual alternatives that could have been used. In any case it’s not really on‐topic for this issue.

I don’t have anything against Musl, but I think it’s extremely unlikely it will be the default on NixOS any time soon unless the feature parity and compatibility compared to glibc changes dramatically – maybe we should close this for now?

xplshn commented 1 month ago

> unless the feature parity and compatibility compared to glibc changes dramatically

It is clear that the musl dev team is not interested in adding bugs to musl. (Everything that does not conform to the POSIX and ISO C specs is considered a bug. There are some rare exceptions, like wtmp, which is not implemented in musl because it is insecure.)

I think the approach to this should be like Gentoo's. They made their codebase (mostly) compiler-independent and libc-independent by fixing the things that wouldn't compile under musl and LLVM, and they offer flavors of Gentoo instead of having a default one. Void Linux does the same thing for musl and glibc. There is no default.

Nix being a rare case, I think binaries could be compiled against a musl shipped with Nix itself (the package manager, not the OS), which would make Nix and its packages both smaller and more portable, without necessarily being statically linked. Patchelf exists too, so it could be done the dirty way or the clean way.

Void Linux is the proof that you can unify glibc and musl by fixing the non-compliant code in programs, then upstreaming the patches, and then shipping in the repos.

vcunat commented 1 month ago

Sure, fixing all musl-related bugs is possible, probably. Also it's possible to build all binaries for both. The real question is who will be doing all the work (both human and machine/money) and whether it's worth it.

Atemu commented 1 month ago

Feel free to set pkgs = pkgsMusl on your systems then. Nixpkgs supports it.

Anyone building a productive general-purpose distro that is supposed to cover a wide range of use cases OOTB without insane maintenance overhead and constant edge cases, however, would be well advised to stick with glibc.
Oh, and so would anyone who targets a userbase that doesn't only speak the one true great American language, because locale support is non-existent in musl.

I don't like glibc (I don't think anyone really does), but the fact of the matter is that it's the libc best supported by applications on Linux, and until that changes, it doesn't make sense for Nixpkgs to default to anything else.

Thiago-Assis-T commented 1 month ago

How and where would I set pkgs = pkgsMusl? I'm at a loss... Are there any docs or sources you could point me to?

V3ntus commented 1 month ago

@Thiago-Assis-T I'm assuming they're referring to this fork of nixpkgs, which contains a musl branch. You can check out @yu-re-ka's flake for example usage: https://cyberchaos.dev/cyberchaoscreatures/musl-nixos

Atemu commented 1 month ago

I'm actually not sure of the mechanism you'd use (perhaps something like _module.args.pkgs = pkgs.pkgsMusl?) but the branch you linked appears to merely contain musl-specific patches that haven't made it upstream yet. Likely useful but not necessary.

YoshiRulz commented 1 month ago

pkgsMusl is part of Nixpkgs proper, like pkgsStatic and pkgsCross.*. You could set nixpkgs.hostPlatform to lib.systems.examples.musl64 to use it system-wide.

nixos-discourse commented 1 month ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rebuild-nixos-with-musl/47655/1

Thiago-Assis-T commented 1 month ago

@YoshiRulz, I'm still having some problems. Do you mind hopping over to Discourse so as not to clutter this issue any further?