Fallible allocation (via Associated Error type)

Ericson2314 commented 5 years ago

Following up from https://rust-lang.zulipchat.com/#narrow/stream/197181-t-libs.2Fwg-allocators/topic/Collections.20beyond.20box.2C.20allocation.20fallibility . I've done a preliminary rebase of https://github.com/rust-lang/rust/pull/52420 at https://github.com/QuiltOS/rust/tree/allocator-error (old versions are https://github.com/QuiltOS/rust/tree/allocator-error-old, https://github.com/QuiltOS/rust/tree/allocator-error-old-1, etc.)

If we add an associated Err type to Alloc, we can set it to ! to indicate allocation is infallible (panics, aborts, etc). Allocators should be written with a non-empty Alloc::Err, but a AbortAdapter<A> is provided which calls handle_alloc_error so for<A: Alloc> <AbortAdapter<A> as Alloc>::Err = !.

This has many benefits.

We get try_ and traditional versions of every collections method that supports fallibility it for no extra cost! This is done by threading the abstract error type. We don't need to duplicate the original method, but can simply do:
```
struct Collection<T, A: Alloc = AbortAdaptor<Global>> { ... }

impl<T, A> Collection<T, A> {
  fn try_foo(&self) -> Result<T, A::Err> {
    ...
  }
}

impl<T> Collection<T> { // A::Err = !
  fn foo(&self) -> T {
    let Ok(x) = self.try_foo();
    x
  }
}
```
The legacy version would just call the non-try_ version and panic-free unwrap the Result<T, !>. Yes we have twice as many methods, but don't have any duplicated logic.
Any container (or individual method on the container) that can not handle fallible allocation can require Err = ! so as not to hold back the alloc-parameterization and fallible-allocation-recovery of the other containers. I.e. this proposal reduces the risk of refactoring the collection in that there's no all-or-nothing bifurcation of outcomes.
Polymorphism all the way up. Just as collections are fallibility-polymorphic with the try_methods (perhaps they should be renamed to indicate the are not always fallible), other crates using collections can also use A::Err to also be fallibility-polymorphic. This avoids an ecosystem split between those that want and don't want to manually deal with allocation failure.

Note that this seems to impose a downside of each collection having to commit to a fallibility up front. But since AbortAdapter can be #[repr(transparent)], we can freely cast/borrow between the two.

Original unclear version for posterity:

If we add an associated Err type to Alloc, we can set it to ! to indicate allocation is infallible (panics, aborts, etc). This has two purposes:

We instantly get try_ versions of every method that supports it for free by threading a non-empty error type. The legacy version would just call the non-try_ version and unwrap the Result<T, !>. Yes we have twice as many methods, but don't have any duplicated logic.
Any container (or individual method on the container) that can not handle fallible allocation can require Err = ! so as not to hold back the alloc-parameterization and fallible-allocation-recovery of the other containers.

glandium commented 5 years ago

The current methods don't return a Result, how do you propose to reconcile this?

Ericson2314 commented 5 years ago

@glandium See the first point: make a bunch of try_* versions that do. I don't consider this much of a cost, because no plan to support fallible allocation without race conditions and with backwards compat will avoid new methods.

glandium commented 5 years ago

The first point kind of was not clear about what it means.

TimDiekmann commented 5 years ago

Where the try_ methods would be defined? In an extension trait? Or change the whole Alloc trait to add the methods?

scottjmaddox commented 5 years ago

I'm not sure how best to go about doing it, but I strongly support Rust adding some support for fallible allocation.

Has there already been discussion about different approaches? Would something like a FallibleAlloc trait make sense? And Alloc could be blanket implemented for FallibleAlloc<Error=!>?

TimDiekmann commented 5 years ago

Has there already been discussion about different approaches?

I know there are some discussions floating around on github and internals.rlo on fallible allocation, but I cannot provide a link atm., those threads are pretty long and spread all over the platforms.

Would something like a FallibleAlloc trait make sense? And Alloc could be blanket implemented for FallibleAlloc<Error=!>?

To me this makes sense, however I'd name them Alloc<Error = ??> and InfallibleAlloc. Then you provide an Alloc implementation and InfallibleAlloc is provided for Alloc<Error = !>.

scottjmaddox commented 5 years ago

To me this makes sense, however I'd name them Alloc<Error = ??> and InfallibleAlloc. Then you provide an Alloc implementation and InfallibleAlloc is provided for Alloc<Error = !>.

Yeah, that would probably be preferable. Has that ship already sailed, though? I think I saw discussions about Alloc being stabilized. Maybe not yet, though.

TimDiekmann commented 5 years ago

The alloc crate has been stabilized. The Alloc trait is gated behind allocator_api and is likely to be changed this way or that.

SimonSapin commented 5 years ago

The Alloc trait (currently unstable) is already fallible: most methods return a Result.

The try_* methods mentioned above refer not to methods of the Alloc trait, but to methods of e.g. collections that internally use heap allocation. For example Vec::try_push that returns an Err instead of aborting the process when allocation fails. We have an accepted RFC for this (tracking issue: https://github.com/rust-lang/rust/issues/48043) that proposes try_reserve and try_reserve_exact on variable-size collections. This is implemented, but still unstable mostly because of uncertainty around what the error type should be. This RFC discusses options for extending this pattern to other APIs such as try_push, recognizing that what’s there is a the bare minimum (e.g. you can do vec.try_reserve(1)?; vec.push(x)) but is not ergonomic, but does not propose a concrete solution.

Separately, for implementing APIs like Vec::push that are infallible (they don’t return a Result), code like this is a very common pattern:

let some_layout = Layout::from_size_align(…);
let ptr = some_allocator
    .alloc(some_layout)
    .unwrap_or_else(|_| std::alloc::handle_alloc_error(some_layout));

To simplify such code, there was discussion in at least https://internals.rust-lang.org/t/pre-rfc-changing-the-alloc-trait/7487 to have APIs like the Alloc trait (possibly as another trait that can be implemented for the same types, or as a wrapper type) for infallible allocation. The alloc method would return NonNull<u8> without a Result, and that API would call handle_alloc_error itself on allocation failure.

While adding an Err associated type in the Alloc trait is something we can do, it’s not clear to my why that would be interesting. In particular, for the two topics discussed above:

Fallible APIs for collections might want a different error type than the Alloc trait:

When handling an allocation error, some useful information to have is the layout (or at least the size) of the allocation request that failed. (This is why std::alloc::handle_alloc_error takes a Layout parameter.) In Vec::try_push or HashMap::try_insert, that information should be part of the error type so that the called doesn’t have to guess at the collection’s allocation strategy. In Alloc::alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> however, the caller already knows what Layout value it just passed, and having a zero-size error type is nice as it keeps the entire Result pointer-sized.
For the ergonomics of implementing infallible collections APIs, Result<T, !> is barely an improvement over some other Result. You’d still need .unwrap() or some future API of Result<T, !> in order to get at the T value. So an associated error type does not remove the ergonomic need for an InfallibleAlloc API.

TimDiekmann commented 5 years ago

Fallible APIs for collections might want a different error type than the Alloc trait

Allocators may provide additional information, why an allocation failed. For example a stack allocator could return, that the stack is full, and only N bytes are free to be used.

For the ergonomics of implementing infallible collections APIs, Result<T, !> is barely an improvement over some other Result. You’d still need .unwrap() or some future API of Result<T, !> in order to get at the T value.

Result<T, !>::unwrap() is a noop while Result<T, AllocErr>::unwrap() requires a branch.

SimonSapin commented 5 years ago

This unwrap is a noop in optimized machine code, but it still needs to be present in source code so it’s less convenient than an API that returns a bare T (or doesn’t return).

gnzlbg commented 5 years ago

We instantly get try versions of every method that supports it for free by threading a non-empty error type. The legacy version would just call the non-try version and unwrap the Result<T, !>. Yes we have twice as many methods, but don't have any duplicated logic.

@Ericson2314 Alternatively, instead of duplicating all methods again as try_, an associated AllocHandle::Err type allows implementors to provide different handles with different error types. For example, for a given allocator like Jemalloc, we can provide two handles, a JemallocFallibleHandle, that properly reports errors, and a JemallocInfallibleHandle that has Err = !.

This allows users to pick whatever they need when they need it, without having to hope that collections and such properly use try_.

Ericson2314 commented 5 years ago

@gnzlbg that is basically what I have done :).

I'm almost done with the rebase. I'm going to clean that up and then amend the OP to actually be clear this time!

SimonSapin commented 5 years ago

I feel that this thread is hopelessly fraught with confusion about fallibility of methods of allocator handles (like Alloc::realloc) v.s. fallibility of methods of collections (like Vec::push). Should we open separate issues for each of these and make it clear which is which?

Ericson2314 commented 5 years ago

@SimonSapin Yes I did confuse people, but they are related in my plan. Basically ever layer should be as polymorphic as possible, kicking the fallibility decision up to the final consumer / root crate.

SimonSapin commented 5 years ago

@Ericson2314 We are not in your head. You need to communicate with people who do not have the context of everything you’re thinking about.

Ericson2314 commented 5 years ago

@SimonSapin indeed I do. Does the OP make sense now?

SimonSapin commented 5 years ago

Yes, this is better.

There’s a downside to this approach: since they require different allocator handle types, you have to pick for any given collection one of fallible or infallible allocation. In today’s Nightly Vec::try_reserve (with an inhabited error type) and Vec::push are both available on a given vector.

gnzlbg commented 5 years ago

There’s a downside to this approach: since they require different allocator handle types, you have to pick for any given collection one of fallible or infallible allocation.

That's a big downside. I can imagine that one wants to, for a particular collection, handle fallibility in some part of the code, and consider the collection infallible in others.

I don't know how we could solve this. I suppose one option would be some sort of conversions, e.g., allowing let v: Vec<T, InfallibleAlloc> = (v: Vec<T, FallibleAlloc>).into(); (using .into for exposition only, we can have other methods or traits for this), if the allocators are "convertible" to each other.

Ericson2314 commented 5 years ago

I think I wrote some as and as_mut methods (could do into too) to work around that problem.

Ericson2314 commented 5 years ago

https://github.com/glandium/rust/compare/box..QuiltOS:allocator-error#diff-3cc290bf2bc1a0ea48cb2b517e021d0cR569 Here it is.

SimonSapin commented 5 years ago

I don't know how we could solve this.

We solve it by not making fallibility (a relevant) part of the type. Both fallible and infallible methods are always available. The latter are implemented by calling the former, matching, and calling std::alloc::handle_alloc_error on Err.

An associated error type is potentially useful if allocator impls want to carry extra information in error values. But IMO making the error type uninhabited is not that useful at all. At the allocator handle level I would prefer to have a convenience wrapper type or separate trait that provides an infallible API signature and takes care of calling handle_alloc_error.

Ericson2314 commented 5 years ago

@SimonSapin No that's a bad plan because then you loose fallibility polymorphism and we're left with the ecosystem split.

Ericson2314 commented 5 years ago

[FWIW I'm not even sure how important the mixing use-case is. If I care about some error, why would I panic on it some of the time and handler it other times? I think a better idiom for robust software is to never use AbortAdapter and manually unwrap just the Results which cannot be handled in a better way. This makes the tech debt clear rather than shoving it under the rug. I don't hold this opinion so strongly that I wouldn't implement the casting methods, but I thought I should share it in hopes that we don't sacrifice something I consider incredibly beneficial like fallibility polymorphism for something I consider barely worth it even without that cost.]

Ericson2314 commented 5 years ago

I mean the general rust error handling story is:

unwrap is bad
If you can't handle the error now, Use ?to escalate the issue to your caller.
Even though a bunch of ? and one top level unwrap, and unwrap everywhere all have the same behavior, the former is still better for being fewer syntactic uses of unwrap.

I don't see why allocation failures should be at all different. AbortAdapater is good vis-à-vis 3, but ? is still the best. Frankly, I hope most code is already using ? so adding a variant to your error type for allocation errors isn't that big of a burden. The cases where one uses try_ today become more instances of:

"Oh, I could just escalate this problem upstream but let me try again a different way."
"Let me turn this lower-level error into a higher-level domain-specific error so my caller has an easier time handling this separately."

As already happens with Result-based error handling.

SimonSapin commented 5 years ago

What is fallibility polymorphism? Why is it important or useful?

Ericson2314 commented 5 years ago

@SimonSapin Thanks for asking, I thought I hit this in point 3 in the new OP, but now I'm remembering more nuance from things I wrote down years ago but not yet in this thread.

"fallibility polymorphism" is what I'm calling "allocation failure error type polymorphism". It's the ability to write code, be it collections or "application code" or whatever else with a polymorphic error type, so the code can be used by code that never wants to thinks about allocation failure, code that wants to never panic, and everything in between.

If Rust were going 1.0 in 2021, I'd say the collections should always return Result<_, AllocError> on all their methods. There should be no handle_alloc_error, no need for an associated error type for this (though it may still be good for allocator-specific diagnostics), and users not wanting to handle allocation failure should use ?, which should be ergonomic enough.

But Given that the non-Result-returning methods do exist, I don't propose that. There are two choices:

No associated error type. Implement the legacy method with try_method(..).unwrap_or_else(|| handle_alloc_error()).
My plan. Use associated error type and AbortAdapter. Legacy methods require A: Alloc<Err = !>.

So, contrary to what I wrote in the OP. collections code reuse is not a unique benefit to my plan. (My apologizes, I'll amend.) But there's still the ecosystem split problem with plan 1.

Let's assume that we have some libraries, all of which use collections / alloc:

bar that doesn't care about allocation failure
baz that wants to audit all panics
foo that wishes too support a range of uses including bar and baz above. However foo doesn't want to bother defining its every method twice because that's annoying and it's not constrained by backwards compatibility, so it just defines Result<_, Err<..>> returning ones where E<..> includes allocation failure and may or may not have type parameters.

(bar and baz are basically my two extremes from above, except skipping the farthest extreme of "never panic" since that's easier to audit.)

Here are the problems:

The small problem. bar doesn't want to use ? on the methods from foo because it nowhere-else cares about allocation failure: Maybe it could be cajoled into doing so except it is already happy using the non-Result-returning methods from collections and doesn't want to change. unwrap works but looks bad, and maybe you unwrap to much by accident.

Solution: Result<_, !> can have a special total unwrap (like the extension methods in the void package) that is always safe to use. bar can use that instead to ignore allocation errors while being confident it isn't ignoring anything else.
The big problem: baz wants to use foo (and collections) but is worried about itself or some dependency using the fallible methods by mistake. If the legacy methods are always callable, it's too easy to use them by mistake.

Solution: With the AbortAllocator plan, the legacy functions can require A = AbortAllocator<A1>. This means the methods cannot be so easily called, and it's easier for baz audit for the single pair of casting functions. (In itself and foo.)

Extra Credit: If the methods require AbortAllocator<Global> then if foo exports an alloccator-polymorphic interface baz can be confident that foo doesn't use the legacy methods because it is impossible to cast from A to AbortAllocator<Global> due to parametricity. It doesn't even need to audit foo's code for this (though it still should for unsafe tricks in general, including those that could get around this), but just it's interface to unsure the cast methods aren't use.

SimonSapin commented 5 years ago

If Rust were going 1.0 in 2021, I'd say the collections should always return Result<_, AllocError> on all their methods.

I think this proposal is a lot more controversial than you seem to assume.

users not wanting to handle allocation failure should use ?, which should be ergonomic enough.

It’s really not. Most code is in functions that do not return a Result. In functions that do, the error type might not implement From<AllocErr>.

scott-maddox commented 5 years ago

users not wanting to handle allocation failure should use ?, which should be ergonomic enough.

It’s really not. Most code is in functions that do not return a Result. In functions that do, the error type might not implement From<AllocErr>.

Unless I'm misunderstanding @Ericson2314, I think a more clear way to phrase the original suggestion might be something like "library code that is ambivalent about allocation failure should use ?, which should be ergonomic enough." That library code can impl From<AllocErr> for its error type. The user code that doesn't care about allocation failure should probably just unwrap.

SimonSapin commented 5 years ago

Forcing everyone to sprinkle unwrap everywhere when they want to use Vec is not “ergonomic enough”. (Same with an hypothetical Result::<T, !>::unwrap_never that doesn’t panic, or whatever its name. It still needs to be called.)

Remember that I’m responding to this, which seems to be the premise for everything else here:

If Rust were going 1.0 in 2021, I'd say the collections should always return Result<_, AllocError> on all their methods.

Ericson2314 commented 5 years ago

@SimonSapin Rust is not pre-1.0, and so I'm not proposing getting rid of those methods. It's not the premise for everything else; in fact it's the opposite. After defining "fallibility polymorphism", I realized it's not the only way to avoid splitting the ecosystem; a single AllocErr is fine *if all allocating methods Result<T, AllocErr>`. I wanted to come clean and say it's not the only way.

The example situation below the line is suppose to show very explicitly why no error polymorphism / fallibility polymorphism in the presence of those methods is no good. If you like, skip the top part and read that.

@scott-maddox Yeah that is a lot closer to what I meant. The whole question of splitting or not splitting the ecosystem hinges on what libraries can do to support both users cases. liballoc collections must support both sorts of methods due to backwards comparability, but it's a lot to expect other libraries to also do so. That's a lot more of a nuisance all the other things mentioned so far.

Even user code using unwrap is no good, though (that is problem 1 in my example below the line). It's way to easy to below up other errors besides allocation failure. But with error isomorphism, libraries like bar would have methods would return Result<_, LibraryErrorType<A::Err>>, and user code not caring would use AbortAllocator so that they end up with a Result<_, LibraryErrorType<!>>. Now there's effectively no allocation. failure left.

(Same with an hypothetical Result::<T, !>::unwrap_never that doesn’t panic, or whatever its name. It still needs to be called.)

OK first of all that's still way better than unwrap because there's no risk on panicking on more than allocation failure. Ergonomics is about maintenance not just banging out the first draft.

That said, the situation is still far better than even one unwrap_never per every allocating Result method. Picking up where I left off responding to @scott-maddox, one only needs to use unwrap_unsafe once in a impl From<LibraryErrorType<!>> for UserErrorType. which is otherwise the exact same from impl they would have written anyways (impl From<LibraryErrorType> for UserErrorType) had the library panicked on allocation failure too.

gnzlbg commented 5 years ago

I just want to note that the main reason we have a std::prelude::v1 is to be able to do breaking changes to the std lib API if we need to. That is, we could add a prelude::v2 with incompatible collection types (and a compatibility shim), that, e.g., makes the allocation methods of the collections return Result.

This would be obviously very painful, e.g.,

// crate A - edition2018
fn foo() -> Vec<i32>;

// crate B - edition20XY prelude::v2::*
fn bar() {
     let x: Vec<i32> = A::foo();  // ERROR: type mismatch: v2::Vec<i32> != v1::Vec<i32>
}

probably as painful as upgrading a dependency graph from futures 0.1 to futures 0.3. And as @SimonSapin argues, making all allocation methods return Result impacts ergonomics, so AFAICT it isn't a clear win.

But if we can't resolve #2 to make type parameters unstable, we might actually have to move the collections to a prelude::v2 anyways (and hopefully just have prelude::v1 expose type Vec<T> = v2::Vec<T, alloc::Global>; but maybe that doesn't work either and we will end up with incompatible types).

SimonSapin commented 5 years ago

The prelude module only contains pub use reexports. Vec is defined in the std::vec module.

I think that std::new_vec::Vec that behaves differently from std::vec::Vec would be confusing. If we go with incompatible base types I think they should have a different “base” name like std::vec::NewVec, so that they’re distinguishable even after being imported into a given scope. Swapping the meaning of a given name in the prelude would be especially bad IMO.

to be able to do breaking changes to the std lib API if we need to

This is at best very misleading phrasing.

gnzlbg commented 5 years ago

This is at best very misleading phrasing.

Sorry, breaking changes to the prelude that users get by default would probably be more accurate.

Ericson2314 commented 5 years ago

@gnzlbg Your zulip thread seems positive so far so I hope #2 works out. Even if we would somehow remove those methods it wouldn't be a breaking change to Vec itself.

Before this thread meanders somewhere else, does my example above in https://github.com/rust-lang/wg-allocators/issues/10#issuecomment-489763190 about ecosystem splitting make sense?

Ericson2314 commented 5 years ago

Closing for #23 because this thread is indeed messy.

rust-lang / wg-allocators

Fallible allocation (via Associated Error type) #10