Open RalfJung opened 1 year ago
I think one solution here would be to decide that `System` is just always whatever we decide the default global allocator is (and perhaps it should be renamed to `DefaultGlobal` or something). The name `System` is already fairly misleading, given:

> it is not valid to mix use of the backing system allocator with `System`, as this implementation may include extra work, such as to serve alignment requests greater than the alignment provided directly by the backing system allocator.

And also given that (IIUC) it's actually a vendored dlmalloc port on WebAssembly targets.
So basically you are saying the second docs I quoted are right and the first one should be removed?
That's my opinion, yes.
I think we shouldn't assume `#[global_allocator] static A: System = System;` being the default in Miri. If you depend on `System` being the global allocator you should add `#[global_allocator] static A: System = System;` yourself, and if you don't add it, someone else is allowed to use a different global allocator, which would then cause UB for you. In that case you are at fault, not the crate that specified a different global allocator.
Yeah, that's a pretty compelling point. While I do feel like `System` should be guaranteed to be the default global allocator (and probably renamed), it's very hard to rely on that without UB.
Miri runs binary crates, not lib crates, so it's not possible for someone else to set anything.
If the documentation says that Rust works as if this exists by default:

```rust
#[global_allocator]
static A: System = System;
```

then we have to allow such code; we have no grounds for calling it UB. So @bjorn3 are you saying we should remove the 2nd documentation and keep the first?
> Miri runs binary crates, not lib crates, so it's not possible for someone else to set anything.
Miri is often used to test library crates.
> So @bjorn3 are you saying we should remove the 2nd documentation and keep the first?
Yes.
> Miri is often used to test library crates.
Yes -- by running a binary crate that exercises them (the test crate). What I said remains true.
@bjorn3 I disagree with this, because it means there's no way to "wrap" the default allocator. I should be able to write something like this:

```rust
#[global_allocator]
static A: MyAllocationTracker = MyAllocationTracker::new(System);
```

And have it track my allocations without otherwise changing the behaviour of my program. If you think `System` is not the right type for this (despite it being explicitly documented as the right way to do this) then we need a new `DefaultGlobal` type instead.
> And have it track my allocations without otherwise changing the behaviour of my program.
It shouldn't change the behavior to switch from the default allocator to the system allocator. At most the performance profile. If we decided that the `System` allocator is always the default allocator, then `MyAllocationTracker::new(System)` would not cause any perf regression other than that inherent to `MyAllocationTracker`. If we decided to use a different default allocator, we likely did so to improve performance. In that case using `MyAllocationTracker::new(System)` would be a perf regression, but not beyond what we would have anyway by deciding to only use `System` as the default allocator. If we decide to only use `System` as default, maybe we should rename `System` itself to `DefaultGlobal` and remove any relation to the actual system allocator?
FTR, `System` is stable and cannot be renamed. If that path was taken, a new type entirely would have to be created.
> It shouldn't change the behavior to switch from the default allocator to the system allocator.
In practice it can. We need a way to get precisely the default behaviour, not just "approximately". Whether we do that by upholding the guarantee that `System` is the default, or by introducing a new `DefaultGlobal`, I don't really care, but if we're going to do the latter, then the sooner the better.
I'm unsure you could rely upon that. Making it generally unspecified would allow, for example, delegating to C++ operator new (which I contemplated doing for proc-macros back when lccc was written in C++ - it switched to just pulling in xlang_allocate via custom allocator).
> (despite it being explicitly documented as the right way to do this)

It is also explicitly documented as not being something you can rely on, so you really can't go off the docs here, unfortunately.
> FTR, System is stable and cannot be renamed. If that path was taken, a new type entirely would have to be created.
Ah, my bad. I thought this was covered by the same feature as `Global`. That's unfortunate, but it could just be a change of documentation, TBH.
(Not saying it's necessarily a good idea, but) if we really want to change the name of a stable type, we can move the type and expose a (potentially deprecated) type alias and `const` from the old location. For a unit struct like `System`, I believe these two items fully alias the functionality.
That said, I personally think that `Global` allocation (and the free functions) should not be interchangeable with whatever backs the global allocator. If we allow the AM to replace `Global` allocation with its own allocation, then properties provided by the concrete allocator (such as overallocation) aren't provided by `Global` when this happens.
You mean, even if the global allocator is explicitly set, one cannot allocate via `Global` and then free via the underlying allocator?

Hm, yeah, that makes sense. We want to treat `Global` as special for optimizations.
Just for the sake of considering all options, an alternative to this:

> That said, I personally think that `Global` allocation (and the free functions) should not be interchangeable with whatever backs the global allocator.

would be to say that the "special behaviour" (e.g. that we can optimize away allocations) is a behaviour imparted by the `#[global_allocator]` attribute. With this approach the `alloc`/`dealloc` calls just call into the global allocator with no special behaviour and could be interchanged, and direct use of the corresponding static variable would also benefit from these optimizations.
(To be clear this is completely tangential to my argument about having some way to access the default allocator)
> and direct use of the corresponding static variable would also benefit from these optimizations.

The special behavior comes from LLVM recognizing `__rust_alloc` and `__rust_dealloc`. If you directly call through the `#[global_allocator]` static, this won't trigger. You have to go through `__rust_alloc`/`__rust_dealloc` using `std::alloc::Global`.
@bjorn3 That's an implementation detail though. You could in theory export the methods on the `#[global_allocator]` as `__rust_alloc` and `__rust_dealloc`, making those special to LLVM and the global alloc/dealloc functions just normal functions.
That would mean the methods on a static of type `T` are not the methods from that type `T`. Seems rather strange to me? In fact it's not even coherent...

```rust
#[global_allocator]
static S: MyAlloc = MyAlloc::new();

fn mk_alloc(s: &MyAlloc) {
    s.allocate(...);
}

fn main() {
    mk_alloc(&S);
}
```

`mk_alloc` obviously jumps to the user-written method on `MyAlloc` without any `__rust_alloc` magic. And hence when `S` is passed to it, we do call the global_allocator static, but it's not possible to route that call through `__rust_alloc`.
I think the best resolution to the documentation issue is something along the lines of:

- […] `System` as a minimalistic (but unspecified) wrapper around the host's allocation functions.
- `wasm-unknown-unknown`'s std providing a vendored allocator for `System` is a historical accident resulting from the practical compromises made to ship the Wasm MVP, similar to everything else in std which is stubbed out. (wasm targets are really scuffed. Some point in the (far) future, define a `wasm32-canon` target or similar using whatever the eventual stabilization of the wasm component model specifies.)
- The far-future `wasm32-canon`'s `System` will utilize the portability lint to not be defined, unless the component model defines a standard provider for `realloc`. (Given this would cross component boundaries, this seems highly unlikely.) When targeting `wasm32-canon`, a `#[global_allocator]` must be defined.
- Make it clearer how the default `#[global_allocator]` is injected into the root standalone crate types (i.e. `bin`, `staticlib`, `cdylib`) if not provided by library code.
- If `#[global_allocator]` cannot be used with `staticlib`/`cdylib`, document that as such. (Them thus using the default `System` allocator can still be stated explicitly, but as a result of that restriction, not a fact of its own.)
- […] `#[global_allocator]`; tracked in the wg-alloc repo.

> Make it clearer how the default `#[global_allocator]` is injected into the root standalone crate types (i.e. `bin`, `staticlib`, `cdylib`) if not provided by library code.
I would like for there to be additional rules around linking multiple standalone crates together. In particular, if you link a `staticlib` or `cdylib` to a `bin` or `cdylib`, it should be unspecified whether global_allocator calls use the one provided in the downstream standalone crate (if any) or the allocator provided (additionally, IFNDR if a staticlib explicitly defines a global_allocator and is linked into a `bin` that also explicitly defines a global_allocator), and provide the reverse rule for binary crates. This is independent of whether or not `#[global_allocator]` is allowed at all in `staticlib`/`cdylib` crates (I have no preference here, though I would lean against allowing it for `staticlib`s specifically).

As an alternative, saying that it must use the top-level binary's `#[global_allocator]` if at all specified is perfectly fine for me (as this is the function of the current implementation), though I suspect that rustc would have some issues with it.

I have need of this rule for lccc in order to be able to allow linking to staticlib crates from rust final link targets, or to link together multiple staticlib crates into a single final link target written in any language, as well as to permit overriding the default allocator when dynamically linking std (except maybe on windows). The former is an absolute requirement for my use cases (standard system libraries written in rust) and the latter a QoL improvement since I want to dynamically link std by default on capable targets. This is because of how the allocator is implemented (a single weak symbol in a COMDAT group, referenced by liballoc and defined in libstd+`#[global_allocator]`; the symbol is `std::alloc::__global_allocator` mangled).
@CAD97 I suppose one option would be for the `alloc` crate (or `core`, and move the trait) to expose its own generic implementation of the `Allocator` trait which (at least pretends to) implement the magic elision behaviour, e.g.

```rust
struct OpaqueAllocator<T>(pub T);

impl<T: Allocator> Allocator for OpaqueAllocator<T> { ... }
```

Then it can simply be part of the contract of this type that it may not call the underlying allocator, or may call it multiple times, allowing the compiler to optimize allocations through this type. The `System` allocator can already be wrapped in this type, since it's not observable anyway.
@Diggsey `core` is already the crate that defines the `Allocator` and `GlobalAlloc` traits. The `alloc` crate adds the assumption of a `#[global_allocator]` and defines the safe types that use that global allocator. In the future with a stable `Allocator` trait, the `alloc` crate would be just `System`/`Global` and a bunch of type aliases to the collections defined in core which add a default parameter to provide `A = Global`.
Having a wrapper `std::alloc::Removable<A>` does seem like a reasonable way to model allocation elision with minimal issues, though it's still somewhat unfortunate to add more wrappers around what's very hot code, and without generic sharing this leads to the easy way of doing things getting you monomorphized everywhere (because it's generic) rather than using a single instantiation. (You can still recover that property, but it requires defining two `impl Allocator`s: one for the actual allocation work, and one just to wrap `Removable<MyAllocator>`.)

This would work for removing allocations, but the other code motions (adding, reordering) aren't possible to describe just with a wrapper selecting whether it's going to call your impl or not. Perhaps we don't need these, but it should be noted that
@chorman0773 `staticlib` is defined to be a fully bundled static library with no dynamic dependencies (weak or otherwise) on other Rust code. If you want to link together multiple staticlib crates, the correct crate type is `rlib` or a custom `lccc-staticlib`, not `staticlib`. (Given we have `dylib`/`cdylib`, perhaps it would've been better if this crate type were `cstaticlib`.) Encapsulating whatever implementation strategy for `#[global_allocator]` inside the `staticlib` output should not be difficult; all you (should) need to do is not export its symbol, such that it only has internal linkage and is fully resolved within the `staticlib`.

There's open discussion around a `staticlib-nobundle` or stable (subset of) `rlib` such that it's possible to link together static Rust libraries without duplicating symbols (ignoring monomorphization).

As for dynamic linking, similarly, `cdylib` is defined to be a bundle with no dependencies on other Rust code. If you want to do dynamic linking of std, the correct crate type for that is `dylib` (or a custom `lccc-dylib`), not `cdylib`.

Even if lccc wants to define `dylib`/`rlib` to be the same as `cdylib`/`staticlib`, there's still the bundling difference. And that bundling difference applies to all globals within the bundled build, whether `static`s or `#[global_allocator]`; multiple bundles linked together must have fully distinct worlds.
> staticlib is defined to be a fully bundled static library with no dynamic dependencies (weak or otherwise) on other Rust code. If you want to link together multiple staticlib crates, the correct crate type is rlib or a custom lccc-staticlib, not staticlib. (Given we have dylib/cdylib, perhaps it would've been better if this crate type were cstaticlib.)
The issue is that what I want to do is compile a system library (such as libcrypto) and then have that available to link into, including from rust code, via the standard interface expected of any libcrypto implementation (such as the one from openssl, libressl, etc.). The fact that if I do this I cannot link that version of the library to a rust crate means I do not comply with the standard definition of the library. Same for cdylib, I want to be able to produce system libraries using this type.
> Encapsulating whatever implementation strategy for #[global_allocator] inside the staticlib output should not be difficult; all you (should) need to do is not export its symbol, such that it only has internal linkage and is fully resolved within the staticlib.
This is non-functional wrt. statically linked libraries: there is no such thing as a symbol that can be linked by all objects of an archive and no other object in the same link step, at least not in ELF, COFF, or o65 objects (or OMF, a.out, xo65, and a number of others when using the GNU or BSD archive format). Defining it as internal would limit it to being available only in the object file that defines it, and defining it as external or weak makes it available to all objects in the entire ld invocation. Even with 1 cgu, the allocator will be depended upon by upstream crates, so the definition must be external.

The issue with cdylib is that to support linking against a dynamically linked std, all providers of the allocator symbol are defined with default visibility as external. Additionally, to support MacOS and static linking of std, the symbol is defined as weak in a linkonce COMDAT group. The default allocator is just the `std::alloc::__global_alloc` definition inherited from libstd.
> And that bundling difference applies to all globals within the bundled build, whether statics or #[global_allocator]; multiple bundles linked together must have fully distinct worlds.
Again, this is impossible on existing file formats using the standard archive for static libraries.
If the standard interface expected of any libcrypto implementation only has it importing symbols from libc, then you simply cannot match that with a Rust crate without entirely encapsulating the use of Rust such that no Rust symbols are imported or exported. You can't just add extra imported symbols (even if they're weak symbols) and claim that it provides the exact same interface.
You're making your own compiler. Make your own crate type which does what you need it to. This is far from the locale to argue about lccc trying to force existing compiler options to do things they were not designed to do.
> If the standard interface expected of any libcrypto implementation only has it importing symbols from libc, then you simply cannot match that with a Rust crate without entirely encapsulating the use of Rust such that no Rust symbols are imported or exported.
The standard interface is that it is usable. It's already assumed (the `openssl-sys` crate) that you can import your system libcrypto and libssl in Rust.
> You're making your own compiler. Make your own crate type which does what you need it to.
My assumption was that the cdylib and staticlib options, that is the options that are designed to produce system library types, produced files usable as system libraries. I was also implicitly assuming that I can write my rust code against the standard options expected of the rust compiler, and not have to limit a generally useful crate (lc-crypto) to being "lccc only", that seems highly counterproductive to the whole enterprise. I would further expect in general to be able to link any system library into either a rust program or a program that also links in other rust code, or to be able to combine system libraries, many of which are written in rust.
> This is far from the locale to argue about lccc trying to force existing compiler options to do things they were not designed to do.
No - what I'm doing here is trying to ensure I have the proper guarantees to actually implement the language wrt. the global_allocator which is up for discussion here, while working around a problem that prevents me from writing low-level system code in a systems programming language. And that I'm not given encapsulation requirements that are impossible to implement on existing file formats.
@chorman0773 @CAD97 you are veering wildly off-topic for this issue; whether or not the allocator is shared when linking together crate type X is not what this thread is about.
Could someone with moderation power please mark these posts as off-topic?
> This would work for removing allocations, but the other code motion (adding, reordering) aren't possible to describe just with a wrapper selecting if it's going to call your impl or not. Perhaps we don't need these, but it should be noted that
It could be a wrapper that applies the full set of allocation magic though -- that would be a language primitive, sure, but instead of saying that the magic elision/introduction of allocations is tied to the global allocator, we could tie it to that wrapper. That doesn't seem any harder to specify? And it would make the same magic available to custom allocators.
Pragmatically speaking, we'd add the appropriate LLVM attributes to calls of the functions on this magic wrapper.
I quite like the suggestion of having `Global` effectively be equivalent to `SpookyAllocatorBikeshedMagic<System>` -- specifying the existence of the `SpookyAllocatorBikeshedMagic` wrapper allows us to explain where that behavior will/won't occur in a clean and predictable way, rather than the status quo where the lack of it actually makes specifying the semantics around allocators hard in many cases.
> The far-future wasm32-canon's System will utilize the portability lint to not be defined, unless the component model defines a standard provider for realloc. (Given this would cross component boundaries, this seems highly unlikely.) When targeting wasm32-canon, a #[global_allocator] must be defined.
As I understand it, the canonical ABI of the component model requires exposing methods that allocate within the component itself. As such any component would either need a `#[global_allocator]` or Rust would need to provide a default allocator.
> Make it clearer how the default `#[global_allocator]` is injected into the root standalone crate types (i.e. `bin`, `staticlib`, `cdylib`) if not provided by library code.

> I would like for there to be additional rules around linking multiple standalone crates together. […]
Cdylibs and staticlibs shouldn't leak anything about being a Rust crate. That includes the global allocator. For cdylibs we indeed only export `#[no_mangle]` functions and don't import anything other than `extern "C" { ... }`. So `#[global_allocator]` doesn't leak through the cdylib boundary. For staticlib we unfortunately don't limit symbol exports at the moment, but we need to fix that anyway, as it breaks including a Rust staticlib in a Rust program due to symbol conflicts. For dynamically linking libstd you must use the dylib crate type, not the cdylib crate type.
Please move the symbol leakage discussion to a different issue, will you? It is independent of what we guarantee for default crate types, which this issue is about. The only reason crate types show up in the OP is that I quoted the docs with context. But it's really a separate question. Looks like some people here are way too easily nerd-sniped... (I feel you, I often have the same problem. It's bad for discussions though.)
Who has moderation rights here and can collapse those off-topic posts?
Documentation states that [quote elided] and it also states that [quote elided]. These two statements directly contradict each other -- one says that `System` is the default everywhere, one says that that is not specified. Which one should it be? :)

FWIW Miri currently considers it a bug to create an allocation with the default global allocator and then free it with `System`, matching the first bit of documentation, but contradicting the second.