Open aturon opened 6 years ago
Hi, this feature will require oom
and try_reserve
to be implemented. I'm needing try_reserve
and I'm willing to work (slowly) on it if noone else does.
I see that the previous commit by @Gankro is here and seems to match the RFC description.
Is there anything else I should consider before rebasing that PR?
I can use some mentorship on what's the best way to implement oom=panic/abort . Should I emit a global variable through rustc depending on this case?
I was taking inspiration from https://github.com/rust-lang/rust/pull/32900 but there might be an easier way to emit a boolean global variable.
cc @Gankro, can you help mentor?
try_reserve
methods are implemented in Nightly, but the accepted RFC also includes the ability to make allocation failure cause a panic rather than abort and that part is not implemented. Unfortunately the RFC does not discuss how this would be implemented, other than being controlled by a flag in Cargo.toml
. It’s not clear to me how this can work.
Unfortunately the RFC does not discuss how this would be implemented, other than being controlled by a flag in Cargo.toml. It’s not clear to me how this can work.
We've been talking about panic::set_hook-like hooking for the new oom lang item in #49668, this could be how this works.
@alexcrichton, how does https://github.com/rust-lang/rust/pull/51041 fit with the goal of implementing some way to opt into oom=panic?
@SimonSapin we'd decide that the regression is acceptable, and we'd revert the change.
You use a conditional tense, but oom=panic
is already in an accepted RFC (though the implementation strategy is not quite settled).
Ok sure, we can figure it out in the implementation details? AFAIK there's no need to revert the change as it has basically no benefit today. It's a trivial change to revert as well, it's one line in the standard library
This example in playground asks for 1024*1024*1024*1000
bytes on a server, via v.try_reserve(len)?
without giving an error.
Is this RFC implementation is not at the point, at which errors in such cases will start to show up?
Or, is this a gross over-commit by underlying OS, and this state of affairs will stay like this?
Can try_reserve
in principle reserve only realistic things?
When an error is returned is entirely dependent of the underlying allocator. As far as I know Linux usually overcommits, Windows doesn’t.
https://play.rust-lang.org/?gist=aa668f38117983d1951b58307df04880&version=nightly allocates 120 PB but fails at 130 PB, it is exhausting 48 bits of address space.
Even on Linux though, this API can be useful for 32-bit programs when exhausting the address space is much more realistic.
https://github.com/rust-lang/rust/issues/43596 was the pre-RFC feature request for oom=panic
. I’ve repurposed it as the tracking issue for that feature, since it and try_reserved
are likely to be stabilized at different times. This leaves this issue as tracking only try_reserve
.
https://github.com/rust-lang/rust/pull/52420 will change CollectionAllocErr
to CollectionAllocErr<AllocError>
Will, or would. I’ve comment there that such a change in public API design merits being discussed separately from a large PR.
Yeah definitely. Even if it weren't for the 2018 delaying considering these things, I wouldn't expect it to be merged quick. I just make the PR to prove the feasibility and make my ideas clearer to / interactive for others.
Ignoring the benefits of allocator polymorphism itself (that would be https://github.com/rust-lang/rust/issues/42774), I see 4 big benefits:
Self discipline: Applications can prevent themselves statically from forgetting to handled allocations failures.
Race freedom: allocation divergence due to try_reserving
insufficient space are statically prevented.
Expressiveness: Unclear how to do try_reserve
-like things for "deep" data structures (those containing indirecation).
!
to get today's divergence, or AllocError
to allow them to be explicitly handled.I thus advocating going with error polymorphism and abandoning try_reverve
(I rather reuse try_
for the Result<(), E>
copies of every method we'll need for backwards compat). (But, my PR keeps the existing meaning of try_reserve
so as to allow a transition period where people can try both approaches.)
Firefox needs try_reserve
badly enough that they forked Vec
and HashMap
over it. It’d be nice to stabilize.
Some of the discussion above is about oom=panic
whose tracking has moved to https://github.com/rust-lang/rust/issues/43596. This leaves two unresolved question:
Whether CollectionAllocErr
should gain a type parameter, with the AllocErr
variant holding a value of that type, in preparation for future allocation traits having an associated type for errors. I think we can still make that change after stabilization, if the type parameter has a default. @rust-lang/libs, does that sound accurate?
@Ericson2314, it sounds like you’re objecting to having try_reserve
at all, because instead we should have try_*
variations of every existing method that allocates. However reserve
is such a method so this proposal is a subset of yours, not contradictory. Am I missing something?
@SimonSapin Yes if things were done my way Firefox wouldn't be try_reserv
ing or reserv
ing at all, but just fallibly allocate the memory they need when they need it. That's much simpler and more robust.
(There's also the more minor quibble that under the simplest version of my plan, the Self
types are different and try_reseve
wouldn't be callable for the regular infallible collections. But that can be changed at the cost of more associated type magic.)
The libs team paused associated allocators to prioritize the 2018 edition. That's fine with me, but then I think it only fair that Firefox wait too. I promise as soon as the gates are open I'll rush to revive @glandium's and my PRs.
Right, try_insert
for example would be preferable to try_reserve
+ insert
, but it doesn’t exist yet. You know as well as I do that large changes or additions are harder to get through the process, https://github.com/rust-lang/rfcs/pull/2116 was designed deliberately to be minimal.
I think it only fair that Firefox wait too
It’s not about who gets candy first. RFC 2116 was submitted 14 months ago and accepted 8 months ago. The try_reserve
part has been implemented 7 months ago, stabilizing it now can be a matter of flipping a switch. Allocator-generic collections however are still in the design phase with a number of open questions, they are still gonna take significant work from multiple people.
That said, there is no real urgency for try_reserve
. Firefox using a forked HashMap
is unfortunate in my opinion, but it works. Stabilizing try_reserve
only seems to me like easy incremental improvement.
Well it sounds like we are in something closer to agreement than usual. :)
Allocator-generic collections however are still in the design phase with a number of open questions
The code has been written and if I recall correctly works (there was one ancillary issue with trying to make default impls work well, but it is not fundamental and I believe I found a workaround beyond just requiring more boilerplate). It just needs to be read and debated.
@SimonSapin it's been awhile since I took a look at this, but is the idea that (in the long run) every fallible method on a collection returns CollectionAllocErr<T>
and the T
is specialized per-method to be the "most appropriate"? In the meantime, though, we're only thinking of the try_reserve
methods which would effectively return CollectionAllocErr<()>
and you're thinking that we can add <T = ()>
to the enum definition later?
One possible alternative to this is to avoid defining this type at all I think? For example unlike as we may have originally though AllocErr
wasn't stabilized with global allocators. To that end it increases the surface area of what's to be stabilized to stabilize this feature, and it also means we're not necessarily sold on having AllocErr
yet. These methods could, perhaps, return a bool
for whether they succeeded or not. (this sort of depends if we think it's crucial to switch on why reservation failed, allocation and/or capacity)
I think I'd personally lean towards what you're thinking @SimonSapin by having CollectionAllocErr
with a defaulted type parameter later on, if necessary. I might also advocate, as well, having either a non-exhaustive enum or an opaque struct with boolean accessors.
Oh, I forgot that AllocErr
was not stable yet. I’m fine with changing CollectionAllocErr
to an opaque type for now. (A struct with a private field, to leave us more options later.) Even zero-size, though I’d still prefer Result<(), _>
over bool
even if they are semantically equivalent.
At the moment I don’t really have an opinion on whether CollectionAllocErr
should have a type parameter. I was only responding to https://github.com/rust-lang/rust/issues/48043#issuecomment-409600561 by saying we’ll still be able to do that later if we want. (With the RFC process hat on, to make sure possible objections are not ignored.)
This sort of depends if we think it's crucial to switch on why reservation failed, allocation and/or capacity
It looks like at least some of the code paths in Firefox track the size (and alignment) of the allocation request that failed, so the current enum might be insufficient anyway.
Alright, on further thought it may not make sense to pursue this just yet, until we figure out the rest of the story for allocation traits and error types.
@alexcrichton So in https://github.com/rust-lang/rust/pull/52420/files I did make a CollectionAllocErr
, but that's not really the important part, the associated error type (which would be the parameter) is.
Honestly, I'm not even sure CollectionAllocErr
is worth it. It creates two axis with the associated type: do we hide allocation errors, and do we hide size errors? I can't imagine a case where one should cause an panic/abort and the other shouldn't. I rather have just AllocErr
as is, or have AllocErr
have both variants (In the memory hierarchy we can imagine that at any layer some collection/allocator gets too big, or it cannot get memory from it's backing allocator).
All this another reason, and one I initially forgot (but now remember thinking of when I rebased my PR over try_reserve
was added), that stabilizing try_reserve
now will tie our hands later.
I think this is a feature that we need semi-urgently. There are projects out there that require try_reserve
and this is not something that can be quickly replaced by a custom crate.
I like the idea of CollectionAllocErr
with a private field but I'd like to know if anyone can think of a limitation before proceeding in implementing it.
As far as I can tell, the error type is the only remaining blocker. How about something like this?
pub struct ContainerAllocError {
layout_of_failed_allocation: Option<Layout>,
}
impl ContainerAllocError {
pub fn layout_of_failed_allocation(&self) -> Option<Layout> { self.layout_of_failed_allocation }
}
A struct with private fields (and no public fields), so we could later add more fields or even turn it back into an enum
No type parameter, but we can add one later (with a default) if the Alloc
trait grows an associated type for errors and we want to propagate custom error values
Container
instead of Collection
in the name, in case we want to add Box::try_new
at some point.
std::collections
, or move to another module? Which one?Can contain the Layout
that was passed e.g. to the alloc()
called that failed. Having this info (or at least the size) is a requirement for Firefox, but for example the allocation strategy of HashMap
is an implementation detail that users should not have to guess at based on HashMap::capacity
alone.
layout
, with an explanation in the doc-comment.Speaking of bikesheds, *Err
or *Error
? We stabilized LayoutErr
, but the precedent for types named FooError
is much stronger in https://doc.rust-lang.org/std/error/trait.Error.html#implementors
Alternative definition:
pub enum ContainerAllocError {
CapacityOverflow,
AllocErr {
layout: Layout,
#[unstable(feature = "container_alloc_error_extra")] error: AllocErr,
},
}
Having a public field for a zero-size type is a bit awkward, but the key idea is that it can stay unstable while the rest of the enum is stabilized (together with try_reserve
methods that return it).
When https://github.com/rust-lang/wg-allocators/issues/23 is resolved, based on the outcome we either remove that field:
pub enum ContainerAllocError {
CapacityOverflow,
AllocErr { layout: Layout },
}
… or make its type be a new type parameter of the enum:
pub enum ContainerAllocError<E = AllocErr> {
CapacityOverflow,
AllocErr {
layout: Layout,
error: E,
},
}
impl SomeContainer<T, A: Alloc> {
pub fn try_reserve(n: usize) -> Result<(), ContainerAllocError<A::Err>> {…}
}
@SimonSapin and I disagree on whether it is superior, but my fallibility-polymorphism in https://github.com/rust-lang/rust/pull/60703 is almost ready (see how https://github.com/rust-lang/rust/pull/58457 was just rebased, I just rebased).
Oh, thanks @SimonSapin for mentioning my issue in the second comment I did not see that the first time sorry. For the record, #60703 goes with the pub enum ContainerAllocError<E = AllocErr>
approach, which is more useful because ContainerAllocError<!>
correctly ensures only CapacityOverflow
is possible since all all allocation failures will panic/abort.
(At a risk of re-hashing arguments already made elsewhere, my view is that when panic/abort on allocation error is desired there is no need for try_reserve
over reserve
. So ContainerAllocError<!>
is not a useful construct IMO. But the proposed unstable field to force non-exhaustive matching leaves the door open either way, so this doen’t need to be decided in this thread.)
PR for the above: https://github.com/rust-lang/rust/pull/61780
The PR takes into account my concerns and so looks fine to me. Layout
is a good idea.
@SimonSapin
(At a risk of re-hashing arguments already made elsewhere,
I think that debate should happen in this issue? After all with my plan for fallibility/allocator polymorphism doesn't really point to try_reserve
as written in the RFC being a good idea, even if it is technically possible to support it.
My view is that when panic/abort on allocation error is desired there is no need for
try_reserve
overreserve
.
My argument hinges on one key observation: the code using reserve
/try_reserve
may not want to decide whether to panic/abort or not on allocation failure, but rather leave that decision to to the user of that code.
In general, we could have arbitrary amounts of collection and and other library code that leaves that decision to only the final executable. There's no good way to write that today without exposing try_
and panicking versions of every method---maybe the stdlib collections have to do that anyways for backwards compat, but lets not put that burden on every other library. You can just expose the try versions and rely on users panicking themselves, but that's very annoying and tedious, and users are likely to make mistakes like panicking on too much if other non-allocation errors are possible. Put shortly, we should never tell library authors to write 2 copies of every function, or anyone to unwrap
.
With the polymorphic alloc error, and a safe Result<T, !>::void_unwrap
function, library authors can write just one function, and users can do something that, while a bit tedious, is more importantly completely safe. And it lives open the door to even less fiction with e.g. a Result<T, !>
<=> T
coercion.
https://github.com/rust-lang/rust/pull/61780 has merged. I believe it resolves the remaining issues. I’ve updated the API signatures in the issue description.
Is there anything else to tweak here before stabilization? CC @Manishearth for the hashglobe
crate, @rillian for mp4parse_fallible
@rfcbot fcp merge
Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
What's the deal with the possible custom error value in TryReserveError::AllocError?
Does that involve later possibly adding a type parameter with a default value to TryReserveError? If so, that would break code that constructs CapacityOverflow because default type parameters only come into play in type position not expression position.
let _ = TryReserveError::CapacityOverflow;
error[E0282]: type annotations needed for `TryReserveError<E>`
--> src/main.rs:22:13
|
22 | let _ = TryReserveError::CapacityOverflow;
| - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type for `E`
| |
| consider giving this pattern the explicit type `TryReserveError<E>`, where the type parameter `E` is specified
Does that involve later possibly adding a type parameter with a default value to TryReserveError?
Correct.
If so, that would break code that constructs CapacityOverflow because default type parameters only come into play in type position not expression position.
Ugh, that’s a good point. Back to the drawing board I guess.
@rfcbot fcp cancel
@SimonSapin proposal cancelled.
Another concern: If we decide to forbid zero sized allocations (https://github.com/rust-lang/wg-allocators/issues/16) the allocator would probably take a NonZeroLayout
or similar.
In order to not block this issue even more, I propose to mark layout
in AllocError
as unstable, too, until the WG found a consens.
At this point I think there are three options here:
Keep blocking stabilization of this feature (which has been implemented for 19 months now) for an unbounded amount of time until the Allocators WG decides whether allocators should have an associated error type https://github.com/rust-lang/wg-allocators/issues/23. Note that WG’s repository has had very little activity in the last 5 months, after a brief surge when in was created.
Or take a bet that standard library collections will eventually be made allocator-generic with an associated error type, and stabilize try_reserve
with a generic payload
field in TryReserveError::AllocError
that defaults to ()
.
The risk is that if the bet turns out wrong, we’ll have a useless field and type parameter that is always ()
, in a public type.
Or take a bet that standard library collections will not be made allocator-generic (that feature might end up on parallel collection types, possibly outside of the standard library), or that they will but with a concrete zero-size error type, and stabilize try_reserve
and TryReserveError
without a payload
field.
The risk is that if the bet turns out wrong, we’ll have try_reserve
methods that drop the error value it receives from the allocator.
My preference is for 2 or 3, but I don’t have a strong opinion between them. @rust-lang/libs what do you think?
Note that WG’s repository has had very little activity in the last 5 months, after a brief surge when in was created.
It is true that after the repository was opened, a few months was silent. Recently, however, there was an increase in activity again and there is a WIP PR that is currently blocking the WG from pushing anything upstream.
So I would have suggested to make the errortype non-exhaustive. The non-exhaustiveness can still be removed later.
I'm personally pretty wary and against stabilizing something just because progress isn't happening on it. Stabilization takes real work and has a cost. Simply because an API exists doesn't mean we should stabilize it, even if it's been sitting for
Fallible collections are hard in Rust. There's really no way we can escape that. At this point it still seems clear that the design work that needs to be done has not been done. It also seems like no one really knows how to get the design work done.
I agree this would be a great feature to have, and yeah this has sat around forever as unstable. That being said the length of time something has sat is not at all a justification for stabilization in my opinion.
I think it's pretty bad design to get things into the standard library, then point out it's sat with no feedback for months and propose stabilization.
My understanding was that “Should try_reserve
’s error type carry an error value that generic allocators may return” was the only remaining question that needed resolving. I wrote https://github.com/rust-lang/rust/issues/48043#issuecomment-543952770 on that basis.
But it sounds like you have concerns that are larger than that?
It sounds like your first bullet in that comment has quite a lot of exasperation flowing through it about how slow this process is, and it also sounds like you're drawing attention to emphasize your case about how everyone else should be exasperated about it as well. I understand this process has been slow, and I think it's pretty reasonable that you're exasperated.
I don't think, however, that the exasperation is a justification for stabilization one way or another. We still do not have information about how allocators in the standard library will work out (per collection). That work just simply hasn't been done. You're proposing that we bet one way or another on this API. I think betting is fine, but I want to be very clear that exasperation at the slowness of the process is not a reason to stabilize something. It's can be a reason for stabilization because the functionality is useful enough, but it's generally unproductive in my opinion to vent such exasperation while simultaneously proposing stabilization.
I do not personally have an opinion about how this API should turn out, I don't really want to wade into the design issues of custom allocators on collections personally. I again just want to point out that the exasperation isn't particularly helpful and should never be a reason for stabilization.
Now that some more work has happened on the design of allocators, I think I would prefer to make all methods of a particular Vec
that allocate panic on OOM (instead of aborting), by using something like a PanicOnOOM<MyAlloc>
allocator adapter, e.g., like:Vec<T, PanicOnOOMAlloc<MyAlloc>>
.
That is, instead of returning a null pointer on OOM from the alloc methods, such an allocator would just panic, allowing the user of Vec
to catch the panics. This would make not only a handful of allocating methods fallible, but all of them. And well, if you only want "fallible" allocation for some sections of the code, you can just go from a Vec<T, MyAlloc>
to a Vec<T, PanicOnOOM<MyAlloc>>
for that particular section and back, and then you are guaranteed that no matter how that code gets refactored (e.g. somebody adds a call to Vec::push
somewhere instead of Vec::try_push
, you'll get a panic).
This means that Vec::try_push
and Vec::try_reserve
become maybe a little bit less useful, since if unwinding is available, you can just use Vec::push
and Vec::reserve
(or extend
, or anything really). However, for -C panic=abort
, unwinding doesn't work, so this solution won't work there. To support this use case, Vec::try_{push,reserve}
are very much necessary, since we need to manually propagate a Result
value up the stack to allow recovery.
This approach also has some other implications, e.g., about how the API of the Alloc
trait must specify what's valid for an allocator to do on OOM, but I guess that can be simply: if the alloc functions return a non-null pointer, then allocation succeeded, else, allocation failed (but that does not imply that the allocation function returns normally at all: it can abort, panic, execute some callback, etc.). This allows all sort of wrappers that let you "hook" your own "on_alloc_error
" callbacks, on a per allocator handle basis.
The RFC that proposed try_reserve
also separately proposed an opt-in for panic instead of abort, so that unwinding could be an additional option to recover from OOM:
Under that proposal it’s handle_alloc_error
that would change behavior. GlobalAlloc::alloc
would still be expected (or encouraged) to return null
on errors. It also does not requires allocator-generic containers.
@SimonSapin any update?
We now have a renewed effort at https://github.com/rust-lang/rust/pull/72314 so we can experiment with try_*
methods across the board. I wish (albeit from the sidelines) that such a PR had landed before so we could already be experimenting with those things, but even though that's gone slower than I hope I still think it's prudent to wait and not stabilize this.
@malbarbo I’m not sure what you’re expecting, this is the thread where further discussion about try_reserve
should happen. As you can read above, the current status is that stabilization is blocked on deciding what extra data if any TryReserveError::AllocErr
should carry, which ties into whether generic allocators should have an associated error types. And more generally there is not good consensus to stabilize before generic allocators overall are more settled.
@Ericson2314 https://github.com/rust-lang/rust/pull/72314 is "Stability annotations on generic parameters", which would help making containers generic on the allocator type, but how is it related to try_*
methods? For example Vec
can have try_reserve
without an extra type parameter.
ties into whether generic allocators should have an associated error types.
I don't think this is going to be fully resolved until we get some allocators on collections upstream and this is currently blocked by #72314.
I don't really see, how #77118 is related to this? Where to you want to add an unstable generic annotation?
Previous comment mentioned it was blocked on #72314, and #72314 has been replaced by #77118.
This is a tracking issue for the
try_reserve
part of the RFC "fallible collection allocation" (rust-lang/rfcs#2116).Steps:
HashSet::try_reserve
: https://github.com/rust-lang/rust/pull/58623API: