purpleprotocol / mimalloc_rust

A Rust wrapper over Microsoft's MiMalloc memory allocator
MIT License
524 stars 44 forks source link

`override` feature is unsafe or broken on many platforms and should be labeled as such #41

Open thomcc opened 4 years ago

thomcc commented 4 years ago

Mimalloc's MI_OVERRIDE feature is being used totally incorrectly for us. We always statically link, meaning it only works on a few unixes (never on windows), and that even then you must arrange for it to be first in the linkers path otherwise you'll end up

I think it should be renamed to unsafe_override or experimental_override and disabled on platforms known to be incomplete (dragonflybsd) or buggy (macos), perhaps with an environment var you can use to force the issue.

jq-rs commented 3 years ago

New 1.7.0 mimalloc tries to fix DragonFly and MacOS part. Nevertheless, when I tried with MacOS, it does not seem to work with rustc combined with libmimalloc-sys, at least. Any ideas what could be still missing?

thomcc commented 3 years ago

Yeah, I contributed some of the macOS fixes. The issue isn't mimalloc (necessarially), it's our use of it. The way to use malloc override on macOS is:

  1. Build an override dylib with the override, zone, and interpose functionality enabled.
  2. Launch your pogram with DYLD_INSERT_LIBRARIES=path/to/libmimalloc.dylib and DYLD_FORCE_FLAT_NAMESPACE=1 variables set in the environment, e.g. env DYLD_INSERT_LIBRARIES=libmimalloc.dylib DYLD_FORCE_FLAT_NAMESPACE=1 cargo build or whatever.

DYLD_FORCE_FLAT_NAMESPACE=1 changes how macOS's symbol resolution algorithm works. By default, macOS "fixes" symbol collisions by having them not apply globally. E.g. if I link against two dylibs that both define a helper function with the same name (and have it be accidentally non-static or whatever), there's no collision on macOS (because the namespace is per-library) as there is on other OSes. However, when overriding malloc, we of course need it to be overridden in all locations. Overriding it just in one library be quite bad, as allocating memory in one place and freeing it in another is common. If you turn on DYLD_FORCE_FLAT_NAMESPACE=1, then there's only one namespace for symbols, and so the override applies globally.

Unfortunately, DYLD_FORCE_FLAT_NAMESPACE=1 tends to break things in strange ways. macOS system libraries don't do well with it on. In firefox we had trouble with it and address sanitizer (no custom malloc involved). This could be the issue you're seeing if you were doing the rest to these steps properly. In general, there isn't a good way to override malloc globally and completely on mac for all programs.

So. You'll note that absolutely none of this is doable from the way we have the override feature setup, which it why I described it as being used totally incorrectly. The way we use it (to build a static library that is linked in) is broken on macOS, (and not just macOS either).

The only way to make this work without setting the environment is probably to detect those vars being missing in main() and if they are missing, set them and execv ourselves, which is highly highly dubious (actually, the DYLD_FORCE_FLAT_NAMESPACE one has a header in the Mach-O file format that turns it on automatically).

thomcc commented 3 years ago

For rustc on macOS my recommendation would be

  1. Don't use any override functionality directly.
  2. Statically link mimalloc into the program, and set a #[global_allocator] using it
  3. Override C++'s operator new/operator delete from exactly one file. That can be done with https://github.com/microsoft/mimalloc/blob/master/include/mimalloc-new-delete.h in one of these: https://github.com/rust-lang/rust/tree/master/compiler/rustc_llvm/llvm-wrapper
  4. Hope that LLVM doesn't assume ::operator new() and malloc are the same allocator. A lot of programs do. They'd probably accept a patch for fixing any case where they do, though.
  5. Optionally, try and convince LLVM to switch from directly calling malloc to calling ::operator new() for easier overriding. (This would be tricky since I doubt they want to give up realloc, but maybe they could be convinced to switch to an overridable llvm_malloc() family of functions...)
  6. Optionally, You might also want to patch a few high-usage places in LLVM that use malloc explicitly to instead instead use ::operator new(), such as llvm::SmallVector (their smallvec doesn't have the same downsides as Rust's, and so they use it for all allocation, e.g. instead of std::vector they use llvm::SmallVector<T, 0>)

I was working on this at one point but its such a hassle. I like a lot of things about macOS's system design but the way userspace memory allocation works is a disaster and feels like its from the 1990s.

jq-rs commented 3 years ago

Thanks for the detailed response! Just wondering, jemalloc-sys in rustc claims that they can override malloc on MacOS. Of course, it may not really work? If it does, however, then maybe mimalloc can do it in a similar way. I'll take a look.

thomcc commented 3 years ago

It does not! https://github.com/gnzlbg/jemallocator/blob/master/jemalloc-sys/build.rs#L36-L45

thomcc commented 3 years ago

Note that realpath mentioned there the least of the issues, and truly just the tip of the iceberg when it comes to weird malloc override sadness on macOS.

(For example, free has to be able to handle being passed pointers allocated out of arbitrary zones. This is one example of it happening in the system libraries, but there are others: https://github.com/opensource-apple/objc4/blob/master/runtime/hashtable2.mm#L62-L68)

jq-rs commented 3 years ago

Uhh, thanks. Rustc comments for config.toml are just wrong then.

thomcc commented 3 years ago

Huh, I had assumed they knew they weren't overriding on macOS. I'll bring that up in t-compiler/performance on zulip.

Edit: https://rust-lang.zulipchat.com/#narrow/stream/247081-t-compiler.2Fperformance/topic/Using.20a.20non-system.20allocator.20for.20rustc.20on.20macOS

jq-rs commented 3 years ago

And BTW, the reason I was looking into this is this test run: https://github.com/rust-lang/rust/pull/81782 Pretty good results, it looks like it would be easy to add a new mimalloc option to config.toml, as it seems MacOS can be ignored. The "override" is used here, seems to work fine for Linux, on proof-of-concept level anyway.

thomcc commented 3 years ago

as it seems MacOS can be ignored

(sad macOS user noises)

But yes, switching rustc to use mimalloc is probably a good idea across the board.

thomcc commented 3 years ago

The "override" is used here, seems to work fine for Linux, on proof-of-concept level anyway.

Also, note that this works iff libmimalloc.a is included first in the list of libraries given to the linker, anything before it will not necessarily use it. (It also is more likely to work if it's done in a single object file which we don't do right now but fairly easily could)

Since you're presumably adding this dep into the actual bin crate (the way jemalloc did), that isn't a problem, but it's not exactly accurate that this means the override feature works on linux without caveats, since we have no real control over the order rustc/cargo/whoever passes libaries when invoking the linker.

jq-rs commented 3 years ago

Sounds to me that it would work on par with jemalloc at the moment. Maybe that is good enough, as it needs to be enabled from the config.toml explicitly. I can push the new option changes (as I already have them available) to the PR for reference and fix the confusing OSX part from the comments at the same time.

thomcc commented 3 years ago

Sounds to me that it would work on par with jemalloc at the moment. Maybe that is good enough, as it needs to be enabled from the config.toml explicitly

Oh yes, it's fine for rustc, I just meant I still felt that it was worth renaming the feature to unsafe_override or something in this repository, so that any project that turns it on knows that some care is required.

jq-rs commented 3 years ago

@thomcc It seems possible to make override working for macOS with a reasonable effort, please see https://github.com/rust-lang/rust/issues/82423. Maybe this can be improved for mimalloc too (no sad macOS user noises after this :))

thomcc commented 3 years ago

In theory mimalloc supports it in specific cases, but not in a general/fully safe way, I'll look into what would be needed for rustc's case, but probably won't get to it until next week or so.

jq-rs commented 3 years ago

Thanks! I am not familiar with macOS internals, but as a macOS user can perhaps at least test drive.

thomcc commented 3 years ago

Do you have access to a pre-10.12 macOS machine?

jq-rs commented 3 years ago

I have the latest Big Sur only, so unfortunately no.

thomcc commented 3 years ago

That's fine, I'll see if I have a machine that's old enough (I think I have one in a box somewhere...). Sadly, a lot of these things changed then, and rustc supports back to 10.7