Closed Ericson2314 closed 5 years ago
The current methods don't return a Result
, how do you propose to reconcile this?
@glandium See the first point: make a bunch of try_*
versions that do. I don't consider this much of a cost, because no plan to support fallible allocation without race conditions and with backwards compat will avoid new methods.
The first point kind of was not clear about what it means.
Where the try_
methods would be defined? In an extension trait? Or change the whole Alloc
trait to add the methods?
I'm not sure how best to go about doing it, but I strongly support Rust adding some support for fallible allocation.
Has there already been discussion about different approaches? Would something like a FallibleAlloc
trait make sense? And Alloc
could be blanket implemented for FallibleAlloc<Error=!>
?
Has there already been discussion about different approaches?
I know there are some discussions floating around on github and internals.rlo on fallible allocation, but I cannot provide a link atm., those threads are pretty long and spread all over the platforms.
Would something like a
FallibleAlloc
trait make sense? AndAlloc
could be blanket implemented forFallibleAlloc<Error=!>
?
To me this makes sense, however I'd name them Alloc<Error = ??>
and InfallibleAlloc
. Then you provide an Alloc
implementation and InfallibleAlloc
is provided for Alloc<Error = !>
.
To me this makes sense, however I'd name them
Alloc<Error = ??>
andInfallibleAlloc
. Then you provide anAlloc
implementation andInfallibleAlloc
is provided forAlloc<Error = !>
.
Yeah, that would probably be preferable. Has that ship already sailed, though? I think I saw discussions about Alloc
being stabilized. Maybe not yet, though.
The alloc
crate has been stabilized. The Alloc
trait is gated behind allocator_api
and is likely to be changed this way or that.
The Alloc
trait (currently unstable) is already fallible: most methods return a Result
.
The try_*
methods mentioned above refer not to methods of the Alloc
trait, but to methods of e.g. collections that internally use heap allocation. For example Vec::try_push
that returns an Err
instead of aborting the process when allocation fails. We have an accepted RFC for this (tracking issue: https://github.com/rust-lang/rust/issues/48043) that proposes try_reserve
and try_reserve_exact
on variable-size collections. This is implemented, but still unstable mostly because of uncertainty around what the error type should be. This RFC discusses options for extending this pattern to other APIs such as try_push
, recognizing that what’s there is a the bare minimum (e.g. you can do vec.try_reserve(1)?; vec.push(x)
) but is not ergonomic, but does not propose a concrete solution.
Separately, for implementing APIs like Vec::push
that are infallible (they don’t return a Result
), code like this is a very common pattern:
let some_layout = Layout::from_size_align(…);
let ptr = some_allocator
.alloc(some_layout)
.unwrap_or_else(|_| std::alloc::handle_alloc_error(some_layout));
To simplify such code, there was discussion in at least https://internals.rust-lang.org/t/pre-rfc-changing-the-alloc-trait/7487 to have APIs like the Alloc
trait (possibly as another trait that can be implemented for the same types, or as a wrapper type) for infallible allocation. The alloc
method would return NonNull<u8>
without a Result
, and that API would call handle_alloc_error
itself on allocation failure.
While adding an Err
associated type in the Alloc
trait is something we can do, it’s not clear to my why that would be interesting. In particular, for the two topics discussed above:
Fallible APIs for collections might want a different error type than the Alloc
trait:
When handling an allocation error, some useful information to have is the layout (or at least the size) of the allocation request that failed. (This is why std::alloc::handle_alloc_error
takes a Layout
parameter.) In Vec::try_push
or HashMap::try_insert
, that information should be part of the error type so that the called doesn’t have to guess at the collection’s allocation strategy. In Alloc::alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr>
however, the caller already knows what Layout
value it just passed, and having a zero-size error type is nice as it keeps the entire Result
pointer-sized.
For the ergonomics of implementing infallible collections APIs, Result<T, !>
is barely an improvement over some other Result
. You’d still need .unwrap()
or some future API of Result<T, !>
in order to get at the T
value. So an associated error type does not remove the ergonomic need for an InfallibleAlloc
API.
Fallible APIs for collections might want a different error type than the
Alloc
trait
Allocators may provide additional information, why an allocation failed. For example a stack allocator could return, that the stack is full, and only N
bytes are free to be used.
For the ergonomics of implementing infallible collections APIs,
Result<T, !>
is barely an improvement over some otherResult
. You’d still need.unwrap()
or some future API ofResult<T, !>
in order to get at theT
value.
Result<T, !>::unwrap()
is a noop while Result<T, AllocErr>::unwrap()
requires a branch.
This unwrap
is a noop in optimized machine code, but it still needs to be present in source code so it’s less convenient than an API that returns a bare T
(or doesn’t return).
We instantly get try versions of every method that supports it for free by threading a non-empty error type. The legacy version would just call the non-try version and unwrap the Result<T, !>. Yes we have twice as many methods, but don't have any duplicated logic.
@Ericson2314 Alternatively, instead of duplicating all methods again as try_
, an associated AllocHandle::Err
type allows implementors to provide different handles with different error types. For example, for a given allocator like Jemalloc
, we can provide two handles, a JemallocFallibleHandle
, that properly reports errors, and a JemallocInfallibleHandle
that has Err = !
.
This allows users to pick whatever they need when they need it, without having to hope that collections and such properly use try_
.
@gnzlbg that is basically what I have done :).
I'm almost done with the rebase. I'm going to clean that up and then amend the OP to actually be clear this time!
I feel that this thread is hopelessly fraught with confusion about fallibility of methods of allocator handles (like Alloc::realloc
) v.s. fallibility of methods of collections (like Vec::push
). Should we open separate issues for each of these and make it clear which is which?
@SimonSapin Yes I did confuse people, but they are related in my plan. Basically ever layer should be as polymorphic as possible, kicking the fallibility decision up to the final consumer / root crate.
@Ericson2314 We are not in your head. You need to communicate with people who do not have the context of everything you’re thinking about.
@SimonSapin indeed I do. Does the OP make sense now?
Yes, this is better.
There’s a downside to this approach: since they require different allocator handle types, you have to pick for any given collection one of fallible or infallible allocation. In today’s Nightly Vec::try_reserve
(with an inhabited error type) and Vec::push
are both available on a given vector.
There’s a downside to this approach: since they require different allocator handle types, you have to pick for any given collection one of fallible or infallible allocation.
That's a big downside. I can imagine that one wants to, for a particular collection, handle fallibility in some part of the code, and consider the collection infallible in others.
I don't know how we could solve this. I suppose one option would be some sort of conversions, e.g., allowing let v: Vec<T, InfallibleAlloc> = (v: Vec<T, FallibleAlloc>).into();
(using .into
for exposition only, we can have other methods or traits for this), if the allocators are "convertible" to each other.
I think I wrote some as
and as_mut
methods (could do into
too) to work around that problem.
I don't know how we could solve this.
We solve it by not making fallibility (a relevant) part of the type. Both fallible and infallible methods are always available. The latter are implemented by calling the former, matching, and calling std::alloc::handle_alloc_error
on Err
.
An associated error type is potentially useful if allocator impls want to carry extra information in error values. But IMO making the error type uninhabited is not that useful at all. At the allocator handle level I would prefer to have a convenience wrapper type or separate trait that provides an infallible API signature and takes care of calling handle_alloc_error
.
@SimonSapin No that's a bad plan because then you loose fallibility polymorphism and we're left with the ecosystem split.
[FWIW I'm not even sure how important the mixing use-case is. If I care about some error, why would I panic on it some of the time and handler it other times? I think a better idiom for robust software is to never use AbortAdapter
and manually unwrap
just the Result
s which cannot be handled in a better way. This makes the tech debt clear rather than shoving it under the rug. I don't hold this opinion so strongly that I wouldn't implement the casting methods, but I thought I should share it in hopes that we don't sacrifice something I consider incredibly beneficial like fallibility polymorphism for something I consider barely worth it even without that cost.]
I mean the general rust error handling story is:
unwrap
is bad
If you can't handle the error now, Use ?
to escalate the issue to your caller.
Even though a bunch of ?
and one top level unwrap
, and unwrap
everywhere all have the same behavior, the former is still better for being fewer syntactic uses of unwrap
.
I don't see why allocation failures should be at all different. AbortAdapater
is good vis-à-vis 3, but ?
is still the best. Frankly, I hope most code is already using ?
so adding a variant to your error type for allocation errors isn't that big of a burden. The cases where one uses try_
today become more instances of:
"Oh, I could just escalate this problem upstream but let me try again a different way."
"Let me turn this lower-level error into a higher-level domain-specific error so my caller has an easier time handling this separately."
As already happens with Result
-based error handling.
What is fallibility polymorphism? Why is it important or useful?
@SimonSapin Thanks for asking, I thought I hit this in point 3 in the new OP, but now I'm remembering more nuance from things I wrote down years ago but not yet in this thread.
"fallibility polymorphism" is what I'm calling "allocation failure error type polymorphism". It's the ability to write code, be it collections or "application code" or whatever else with a polymorphic error type, so the code can be used by code that never wants to thinks about allocation failure, code that wants to never panic, and everything in between.
If Rust were going 1.0 in 2021, I'd say the collections should always return Result<_, AllocError>
on all their methods. There should be no handle_alloc_error
, no need for an associated error type for this (though it may still be good for allocator-specific diagnostics), and users not wanting to handle allocation failure should use ?
, which should be ergonomic enough.
But Given that the non-Result
-returning methods do exist, I don't propose that. There are two choices:
No associated error type. Implement the legacy method with try_method(..).unwrap_or_else(|| handle_alloc_error())
.
My plan. Use associated error type and AbortAdapter
. Legacy methods require A: Alloc<Err = !>
.
So, contrary to what I wrote in the OP. collections code reuse is not a unique benefit to my plan. (My apologizes, I'll amend.) But there's still the ecosystem split problem with plan 1.
Let's assume that we have some libraries, all of which use collections / alloc
:
bar
that doesn't care about allocation failure
baz
that wants to audit all panics
foo
that wishes too support a range of uses including bar
and baz
above. However foo
doesn't want to bother defining its every method twice because that's annoying and it's not constrained by backwards compatibility, so it just defines Result<_, Err<..>>
returning ones where E<..>
includes allocation failure and may or may not have type parameters.
(bar
and baz
are basically my two extremes from above, except skipping the farthest extreme of "never panic" since that's easier to audit.)
Here are the problems:
The small problem. bar
doesn't want to use ?
on the methods from foo
because it nowhere-else cares about allocation failure: Maybe it could be cajoled into doing so except it is already happy using the non-Result
-returning methods from collections and doesn't want to change. unwrap
works but looks bad, and maybe you unwrap
to much by accident.
Solution: Result<_, !>
can have a special total unwrap
(like the extension methods in the void
package) that is always safe to use. bar
can use that instead to ignore allocation errors while being confident it isn't ignoring anything else.
The big problem: baz
wants to use foo
(and collections
) but is worried about itself or some dependency using the fallible methods by mistake. If the legacy methods are always callable, it's too easy to use them by mistake.
Solution: With the AbortAllocator
plan, the legacy functions can require A = AbortAllocator<A1>
. This means the methods cannot be so easily called, and it's easier for baz
audit for the single pair of casting functions. (In itself and foo
.)
Extra Credit: If the methods require AbortAllocator<Global>
then if foo
exports an alloccator-polymorphic interface baz
can be confident that foo
doesn't use the legacy methods because it is impossible to cast from A
to AbortAllocator<Global>
due to parametricity. It doesn't even need to audit foo
's code for this (though it still should for unsafe tricks in general, including those that could get around this), but just it's interface to unsure the cast methods aren't use.
If Rust were going 1.0 in 2021, I'd say the collections should always return
Result<_, AllocError>
on all their methods.
I think this proposal is a lot more controversial than you seem to assume.
users not wanting to handle allocation failure should use
?
, which should be ergonomic enough.
It’s really not. Most code is in functions that do not return a Result
. In functions that do, the error type might not implement From<AllocErr>
.
users not wanting to handle allocation failure should use
?
, which should be ergonomic enough.It’s really not. Most code is in functions that do not return a
Result
. In functions that do, the error type might not implementFrom<AllocErr>
.
Unless I'm misunderstanding @Ericson2314, I think a more clear way to phrase the original suggestion might be something like "library code that is ambivalent about allocation failure should use ?, which should be ergonomic enough." That library code can impl From<AllocErr>
for its error type. The user code that doesn't care about allocation failure should probably just unwrap
.
Forcing everyone to sprinkle unwrap
everywhere when they want to use Vec
is not “ergonomic enough”. (Same with an hypothetical Result::<T, !>::unwrap_never
that doesn’t panic, or whatever its name. It still needs to be called.)
Remember that I’m responding to this, which seems to be the premise for everything else here:
If Rust were going 1.0 in 2021, I'd say the collections should always return
Result<_, AllocError>
on all their methods.
@SimonSapin Rust is not pre-1.0, and so I'm not proposing getting rid of those methods. It's not the premise for everything else; in fact it's the opposite. After defining "fallibility polymorphism", I realized it's not the only way to avoid splitting the ecosystem; a single AllocErr
is fine *if all allocating methods Result<T, AllocErr>`. I wanted to come clean and say it's not the only way.
The example situation below the line is suppose to show very explicitly why no error polymorphism / fallibility polymorphism in the presence of those methods is no good. If you like, skip the top part and read that.
@scott-maddox Yeah that is a lot closer to what I meant. The whole question of splitting or not splitting the ecosystem hinges on what libraries can do to support both users cases. liballoc
collections must support both sorts of methods due to backwards comparability, but it's a lot to expect other libraries to also do so. That's a lot more of a nuisance all the other things mentioned so far.
Even user code using unwrap
is no good, though (that is problem 1 in my example below the line).
It's way to easy to below up other errors besides allocation failure. But with error isomorphism, libraries like bar
would have methods would return Result<_, LibraryErrorType<A::Err>>
, and user code not caring would use AbortAllocator
so that they end up with a Result<_, LibraryErrorType<!>>
. Now there's effectively no allocation. failure left.
(Same with an hypothetical
Result::<T, !>::unwrap_never
that doesn’t panic, or whatever its name. It still needs to be called.)
OK first of all that's still way better than unwrap
because there's no risk on panicking on more than allocation failure. Ergonomics is about maintenance not just banging out the first draft.
That said, the situation is still far better than even one unwrap_never
per every allocating Result
method. Picking up where I left off responding to @scott-maddox, one only needs to use unwrap_unsafe
once in a impl From<LibraryErrorType<!>> for UserErrorType
. which is otherwise the exact same from impl they would have written anyways (impl From<LibraryErrorType> for UserErrorType
) had the library panicked on allocation failure too.
I just want to note that the main reason we have a std::prelude::v1
is to be able to do breaking changes to the std lib API if we need to. That is, we could add a prelude::v2
with incompatible collection types (and a compatibility shim), that, e.g., makes the allocation methods of the collections return Result
.
This would be obviously very painful, e.g.,
// crate A - edition2018
fn foo() -> Vec<i32>;
// crate B - edition20XY prelude::v2::*
fn bar() {
let x: Vec<i32> = A::foo(); // ERROR: type mismatch: v2::Vec<i32> != v1::Vec<i32>
}
probably as painful as upgrading a dependency graph from futures
0.1 to futures
0.3. And as @SimonSapin argues, making all allocation methods return Result
impacts ergonomics, so AFAICT it isn't a clear win.
But if we can't resolve #2 to make type parameters unstable, we might actually have to move the collections to a prelude::v2
anyways (and hopefully just have prelude::v1
expose type Vec<T> = v2::Vec<T, alloc::Global>;
but maybe that doesn't work either and we will end up with incompatible types).
The prelude module only contains pub use
reexports. Vec
is defined in the std::vec
module.
I think that std::new_vec::Vec
that behaves differently from std::vec::Vec
would be confusing. If we go with incompatible base types I think they should have a different “base” name like std::vec::NewVec
, so that they’re distinguishable even after being imported into a given scope. Swapping the meaning of a given name in the prelude would be especially bad IMO.
to be able to do breaking changes to the std lib API if we need to
This is at best very misleading phrasing.
This is at best very misleading phrasing.
Sorry, breaking changes to the prelude that users get by default would probably be more accurate.
@gnzlbg Your zulip thread seems positive so far so I hope #2 works out. Even if we would somehow remove those methods it wouldn't be a breaking change to Vec
itself.
Before this thread meanders somewhere else, does my example above in https://github.com/rust-lang/wg-allocators/issues/10#issuecomment-489763190 about ecosystem splitting make sense?
Closing for #23 because this thread is indeed messy.
Following up from https://rust-lang.zulipchat.com/#narrow/stream/197181-t-libs.2Fwg-allocators/topic/Collections.20beyond.20box.2C.20allocation.20fallibility . I've done a preliminary rebase of https://github.com/rust-lang/rust/pull/52420 at https://github.com/QuiltOS/rust/tree/allocator-error (old versions are https://github.com/QuiltOS/rust/tree/allocator-error-old, https://github.com/QuiltOS/rust/tree/allocator-error-old-1, etc.)
If we add an associated
Err
type toAlloc
, we can set it to!
to indicate allocation is infallible (panics, aborts, etc). Allocators should be written with a non-emptyAlloc::Err
, but aAbortAdapter<A>
is provided which callshandle_alloc_error
sofor<A: Alloc> <AbortAdapter<A> as Alloc>::Err = !
.This has many benefits.
We get
try_
and traditional versions of every collections method that supports fallibility it for no extra cost! This is done by threading the abstract error type. We don't need to duplicate the original method, but can simply do:The legacy version would just call the non-
try_
version and panic-free unwrap theResult<T, !>
. Yes we have twice as many methods, but don't have any duplicated logic.Any container (or individual method on the container) that can not handle fallible allocation can require
Err = !
so as not to hold back the alloc-parameterization and fallible-allocation-recovery of the other containers. I.e. this proposal reduces the risk of refactoring the collection in that there's no all-or-nothing bifurcation of outcomes.Polymorphism all the way up. Just as collections are fallibility-polymorphic with the
try_methods
(perhaps they should be renamed to indicate the are not always fallible), other crates using collections can also useA::Err
to also be fallibility-polymorphic. This avoids an ecosystem split between those that want and don't want to manually deal with allocation failure.Note that this seems to impose a downside of each collection having to commit to a fallibility up front. But since
AbortAdapter
can be#[repr(transparent)]
, we can freely cast/borrow between the two.Original unclear version for posterity:
If we add an associated
Err
type toAlloc
, we can set it to!
to indicate allocation is infallible (panics, aborts, etc). This has two purposes:We instantly get
try_
versions of every method that supports it for free by threading a non-empty error type. The legacy version would just call the non-try_
version and unwrap theResult<T, !>
. Yes we have twice as many methods, but don't have any duplicated logic.Any container (or individual method on the container) that can not handle fallible allocation can require
Err = !
so as not to hold back the alloc-parameterization and fallible-allocation-recovery of the other containers.