Closed froydnj closed 6 years ago
I think this is pending the custom allocator work.
Yes, in general, the standard library assumes that you can't really handle OOM. Using libcore and custom allocators, you could implement whatever behavior you'd like, but you'd still need to re-implement this kind of thing on top of it.
The standard library is going to operate this way for the foreseeable future, as it would require changing the signatures of stable things (like vec!) in a backwards-incompatible way. So I'm going to give this one a close.
So I should have been clearer about this: I don't think that all operations would need to return bool or Option or whatever. Firefox's data structures (which I'm most familiar with for something like this) typically have an "infallible" (i.e. panic on allocation failure) and "fallible" (return error value on allocation failure) variant of all the appropriate operations. Simply adding fallible operations, while increasing the API surface quite a bit, should also mean that this work could be done in a backwards-compatible way, as the current (infallible) API wouldn't have to change. So you'd have vec! and vec_fallible!, for instance, Vec::with_capacity and Vec::with_capacity_fallible, and so forth.
To my mind, it'd be better for the standard library to have this sort of thing, rather than folks having to write their own crate for VecFallible<T> or whatever and needing to propagate that through dependencies.
Does that seem reasonable, @steveklabnik? I'm afraid I'm ignorant of the custom allocator bits cited herein, so I can't really tell if that's a way to modify the allocation behavior of Vec<T> or something else.
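As a sketch of the paired-API shape described above (the `_fallible` name and the Option-based signature are hypothetical, not real std methods; today's stable `try_reserve_exact` stands in for the fallible allocation path):

```rust
// Hypothetical fallible counterpart to Vec::with_capacity, as suggested
// above. Returns None instead of aborting when allocation fails.
fn with_capacity_fallible<T>(cap: usize) -> Option<Vec<T>> {
    let mut v = Vec::new();
    // try_reserve_exact (stable since Rust 1.57) reports failure
    // rather than aborting the process.
    v.try_reserve_exact(cap).ok()?;
    Some(v)
}

fn main() {
    // Infallible variant: aborts the process on OOM.
    let _a: Vec<u8> = Vec::with_capacity(16);
    // Fallible variant: the caller decides what to do on OOM.
    let b: Vec<u8> = with_capacity_fallible(16).expect("allocation failed");
    assert!(b.capacity() >= 16);
}
```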
Independent of the custom allocators, i.e. Vec<T, A> where A: Allocate, I guess Vec would still have to expose fallible versions of these operations in order to avoid A's default out-of-memory behavior. This is possible to implement right now (RawVec manually calls oom, for example), but probably needs an RFC.
Yeah, this is a tricky problem. I'd really prefer not to bloat up the collections APIs with _fallible variants if we can avoid it. However, custom allocators will do nothing for this, as apasel notes.
To be exhaustive, we'd need to provide fallible versions of:
Vec:
Maps:
Unless you believe that only a subset is "truly" necessary?
Waking up, and this is getting more in cache. Part of this problem is the fuzzy nature of OOM on modern platforms (at least, Linux). It's my understanding that you can happily allocate some egregious amount of memory from a bad image, only to get smacked down by the OOM killer when you really start trying to use the memory, and there's no way for us to deal with that.
Is there any problem with just having an incredibly reasonable policy like "no images bigger than 100MB"?
I assume this wouldn't be backwards-compatible, but it'd be nice to avoid the proliferation of methods with something like:
#![feature(default_type_parameter_fallback)]
pub trait AllocResult<T = (), E = ()> {
fn ok(value: T) -> Self;
fn err(err: E) -> Self;
}
impl<T, E> AllocResult<T, E> for T {
fn ok(value: T) -> T { value }
fn err(_err: E) -> T { oom(); }
}
impl<T, E> AllocResult<T, E> for Result<T, E> {
fn ok(value: T) -> Self { Ok(value) }
fn err(err: E) -> Self { Err(err) }
}
impl<T> Vec<T> {
pub fn with_capacity<R: AllocResult<Self> = Self>(cap: usize) -> R {
let ptr = alloc::heap::allocate(...);
if ptr.is_null() { R::err(()) } else { R::ok(Vec { ... }) }
}
pub fn push<R: AllocResult = ()>(&mut self, item: T) -> R {
if self.needs_reallocate() {
let ptr = alloc::heap::reallocate(...);
if ptr.is_null() { return R::err(()); }
...
}
...
R::ok(())
}
}
#[test]
fn test() {
let vec: Vec<i32> = Vec::with_capacity(5); // aborts on OOM
let mut vec: Vec<i32> = match Vec::with_capacity(5) { // returns `Err` on OOM
Ok(vec) => vec,
Err(_) => panic!("OOM"),
};
vec.push(1); // aborts on OOM
match vec.push(1) { // returns `Err` on OOM
Ok(_) => {}
Err(_) => panic!("OOM"),
}
}
But this is clearly less discoverable and comprehensible for users.
Waking up, and this is getting more in cache. Part of this problem is the fuzzy nature of OOM on modern platforms (at least, Linux). It's my understanding that you can happily allocate some egregious amount of memory from a bad image, only to get smacked down by the OOM killer when you really start trying to use the memory, and there's no way for us to deal with that.
That's a reasonable point, but on Windows the memory allocator really can fail, and Rust should be able to deal with that relatively gracefully. I don't know about OS X; I'm guessing it's more like Windows than Linux in this regard?
Is there any problem with just having an incredibly reasonable policy like "no images bigger than 100MB"?
A 100MB image is just 5k x 5k x 32bpp (raw pixel data), which is pretty big (roughly 3x 4k monitors), but doesn't sound like anything too out of the ordinary.
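The arithmetic checks out: 5,000 x 5,000 pixels at 4 bytes (32 bits) per pixel is exactly 100,000,000 bytes of raw pixel data:

```rust
fn main() {
    // 5k x 5k image, 32 bits (4 bytes) per pixel.
    let bytes: u64 = 5_000 * 5_000 * 4;
    assert_eq!(bytes, 100_000_000); // 100 MB (decimal megabytes)
}
```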
I'm having trouble finding details TBH. It would appear that FreeBSD supports overcommit, and by extension OSX/iOS seem to as well, at least to a limited extent. It certainly seems that they will do implicit deduplication and then copy-on-write transparently. Since the COW can presumably fail to find any physical memory (otherwise why bother? cache?), that opens up code to the same implicit failure state. iOS at least will send your application a warning that the OOM killer is eyeing it, giving it a chance to make amends (but this may be futile if other memory-starved applications are eating all the memory you're giving up).
Good old SIGDANGER
IIRC there was some discussion about allowing custom allocators someday. In combination with specialization, could an allocator + impl<T> Vec<T, NonPanicAlloc> {} be used to create an entirely new set of functions (returning Result) while blanking out the original functions?
cc me
One could indeed expose different ops/signatures for custom allocators. In fact I believe it could all be hacked on top of associated types, where type AllocResult = () for GlobalAlloc or whatever. However it seems like this wouldn't be desirable for the OP? As in, they specifically don't want to have to thread fallibility throughout their types. Only in specific locations do they care. Is that correct?
Wouldn't it be more general to leave everything as it is and offer a way to locally "catch" oom? Similar to catch_panic?
CC @pnkfelix, who has been working on allocator stuff.
As in, they specifically don't want to have to thread fallibility throughout their types. Only in specific locations do they care. Is that correct?
That's correct.
There's also a readability argument to be made: it's easier to tell whether something can fail if you read:
if !v.append_fallible(...) { ... }
or whether something is written incorrectly:
// Forgot to check the return value!
v.append_fallible(...);
vs.
// Is v an infallible vector or did I forget to check the return value?
v.append(...);
You could address this with #[must_use] on VecFallible<T>'s methods. Firefox went the route of separate types for its primary Vec-alike and found it difficult to share code between types and still provide the annotation. It's possible that doing the sharing and using the annotation would be easier in Rust?
Having two different types for vectors can also make for some awkward code: why do I care about the fallibility of allocating operations if the code that I'm writing doesn't actually care about doing allocation on the vector? You could cleverly address this problem with traits or templates, I suppose.
In any event, Firefox is currently in the process of ripping that type distinction out and using separate methods instead. We think it's much more valuable to see the fallibility of the operation at the call site, rather than having it reside on the type of something, since the type may not be visible/obvious at the call site.
We think it's much more valuable to see the fallibility of the operation at the call site, rather than having it reside on the type of something, since the type may not be visible/obvious at the call site.
Yes, it should be obvious at the call site and not create type-clutter. But duplicating all the functions has the same problem: instead of having two types, you end up with loads of functions that do almost the same thing.
With panic it's basically the same: the standard library mostly has functions that return a Result or an Option that the user can unwrap or handle, instead of panicking functions. On the other hand, there's catch_panic, to turn a panicking function into one returning a Result.
A solution similar to catch_panic would look like
if let Ok(()) = catch_oom(|| v.push(42)) {
// do something
}
If you forget to use the return value, the normal must_use lint would catch that. If such a wrapper is too much, you can always use a macro to make it less verbose: oom!(v.push(42)).
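A minimal sketch of such an `oom!` macro (the macro and its abort-on-failure behavior are the commenter's suggestion, not an existing API; `try_reserve` is used as a stand-in fallible operation, and `std::process::abort()` stands in for `alloc::oom()`):

```rust
// Sketch of the suggested `oom!` wrapper: unwrap a fallible allocating
// operation, aborting the process on failure.
macro_rules! oom {
    ($e:expr) => {
        match $e {
            Ok(v) => v,
            Err(_) => std::process::abort(), // stand-in for alloc::oom()
        }
    };
}

fn main() {
    let mut v: Vec<u32> = Vec::new();
    // try_reserve is the closest stable fallible operation today.
    oom!(v.try_reserve(1));
    v.push(42);
    assert_eq!(v, vec![42]);
}
```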
Please no catching of oom. I think panic on oom is untenable for the same reasons: more complicated exception-safety issues for lots of unsafe-marked code. See #26951. The Rusty way of using return types for error handling is a great model.
You can even add the fallible allocation in user code (example) using unstable features, although I guess it's debatable if rust provides a guarantee that it will work.
@froydnj #[must_use] on Result would cover this actually (it's a per-type attribute, not per-method).
@bluss, could you elaborate how a failed 1GB allocation (say, for a large photograph) would prevent unwinding later? My understanding is that there are many myths around overcommit, and that the system can actually tell prior to memory access whether an allocation is likely possible.
There will be cases due to overcommit where the application receives a "successful" allocation, and it crashes when trying to use it later. However, that doesn't seem to be the problem, because we can't do much about that.
@nathanaeljones I don't understand the scenario, I don't see why a failed allocation would prevent unwinding later. I may be missing something.
The only Rust code that should have to worry about unsoundness in the presence of unwinding is unsafe code, and there's a concern that if we introduce new panics in functions that are used in unsafe code (for example Vec::with_capacity), then we create new soundness issues. As it is presently, the code can rely on the allocation either succeeding or the program aborting, so there is no need to handle inconsistent states for that critical section of the code (whatever it may be).
It does seem unlikely, and I can't find any problematic code with a quick search, but introducing new panics in stable API is a hazard if there is code that relied on it to never panic.
I was just wondering about this issue and @froydnj pointed me at this PR.
It is a common occurrence in Firefox that we get crash reports that are triggered by OOMs caused by very large allocation requests (e.g. 100 MB or more). Our standard response to these is to make the allocation fallible. Firefox's data structures support this fairly well so it's usually a straightforward fix.
Although aborting on OOM is a reasonable default, I'm worried that Rust doesn't have a graceful way of recovering from a 100 MB+ allocation request failing.
As for Linux overcommit: at least for Firefox, to a first approximation, Windows is the only platform that matters. So I wouldn't focus too much on overcommit.
FYI: in Firefox we are planning to implement FallibleVec and FallibleHashMap classes. This will likely involve copy-pasting the Vec and HashMap code and then adding the missing fallible operations. Which is awful, but necessary. (Strings might be necessary, too.)
We regularly change allocation sites in Firefox from infallible (which is the default) to fallible in response to crash reports that show OOM failures due to large allocation sizes. These are typically on Win32 where OOM due to address space exhaustion is common when doing large allocations. See https://bugzilla.mozilla.org/show_bug.cgi?id=1329171 for an example. We need to be able to do the same thing in Rust code.
Hey all, this has become high priority for the Firefox devs (which includes me) as they integrate more major Rust components into Gecko, so I'm going to start drafting up a proposal for providing some version of fallible collection methods to avoid fragmenting the ecosystem over this matter.
Some quick thoughts:
Don't care. This can make software more reliable when it works. Sometimes it won't. Oh well.
I firmly believe that a type-level distinction would be bad to have by default. Fallibility is fairly fluid in practice, and is nice to be able to integrate incrementally. Also if we provide fallible methods it's not a big deal for someone to newtype std collections to only expose the fallible APIs.
There are three tacks we can take here:
minimal: just try_reserve, try_reserve_exact, and try_with_capacity. Users can impl e.g. try_push with { try_reserve(1)?; push() }. I expect this will cover the most important cases well, with minimal commitment from std. However, some methods like extend may still infallibly allocate and there's no way to build a perfect wrapper to prevent this.
maximal: do all the methods. Perfect implementations, total commitment to mirror all APIs which "may" allocate. I have a feeling we're going to end up here one day, but this is a big commitment to make right away. Might lead to Intense Macroification of libcollections (😿).
half-n-half: Do some major methods, skimp on niche ones. e.g. provide try_extend/try_push. I expect this will just make everyone mad. Especially me when I get constantly pinged about why there's no try_entry or try_splice.
I'm going to push for minimal for now, with an intent to propose maximal when we have experience with these APIs under our belts.
try_* methods will return a Result, with an Error format based on the new Allocator APIs. I need to review that RFC and the format, which largely went down while I was absent from the community last year.
I expect methods like try_push will need to return the pushed element on Err; I don't want to have to propose a callback-based or Entry-style API for this.
Capacity overflow panics -- might make them a state in the Err? TBD!
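The "minimal" option composes exactly as described: `try_push` really can be written in user code as `{ try_reserve(1)?; push() }`. A sketch using the `Vec::try_reserve` that eventually landed in std (the extension trait itself is hypothetical user code, not a std API):

```rust
use std::collections::TryReserveError;

// Hypothetical user-side extension trait, built on the "minimal" API.
trait TryPush<T> {
    fn try_push(&mut self, item: T) -> Result<(), TryReserveError>;
}

impl<T> TryPush<T> for Vec<T> {
    fn try_push(&mut self, item: T) -> Result<(), TryReserveError> {
        // Fallibly make room for one more element...
        self.try_reserve(1)?;
        // ...after which the infallible push cannot allocate.
        self.push(item);
        Ok(())
    }
}

fn main() {
    let mut v: Vec<u32> = Vec::new();
    v.try_push(7).expect("allocation failed");
    assert_eq!(v, vec![7]);
}
```

This also illustrates the caveat above: nothing stops a caller from using `extend` or another infallibly allocating method on the same Vec.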
@gankro: thank you for the write-up. A few notes.
try_with_capacity is unnecessary, because you can do new followed by reserve or reserve_exact, depending on the container. Which is good because then you don't have to deal with with_capacity_and_hasher for HashMap and HashSet.
Some container types need fallible reserve/capacity methods because they allocate a big contiguous chunk. Other container types don't need fallible reserve/capacity methods because, as I understand it, they allocate a single small chunk or lots of small chunks rather than a single big chunk -- if allocation of a small thing fails you're probably about to OOM anyway, so that's not an interesting case to try to recover from.
@Gankro
try_* methods will return a Result, with an Error format based on the new Allocator APIs.
You might want to consider just an Option. The errors used in the Alloc trait (AllocErr) can express one of two error conditions: out of memory (AllocErr::Exhausted) and an unsupported allocation (AllocErr::Unsupported). I strongly believe that the latter should be considered evidence of a bug, not a runtime error (as in, you provided an allocator that cannot support the allocations the collection needs to make). Thus, the only error we care about is OOM, and so an Option suffices -- Some on success, None on OOM.
That said, there may be forwards-compatibility issues that using a Result makes better, and I'm not experienced enough here to comment, so if that's the reasoning, then I have no concerns. My only comment is to say that we definitely shouldn't have collections returning errors for unsupported allocations -- that should just be considered a bug.
I have a fourth API option:
Make everything return Result (bear with me wrt backwards compat)!
- Add an infallible allocator wrapper that has a Result<T, !> return type and aborts on errors from the inner allocator.
- Allow coercing Result<T, !> to T (even for fn foo() -> Result<T, !> to fn foo() -> T and &Result<T, !> to &T, ...).
- Change must_use to work on enum variants and only require it on Result::Err.
Now anyone can choose their preferred method of treating alloc errors, without cluttering the API.
I firmly believe that a type-level distinction would be bad to have by default. Fallibility is fairly fluid in practice, and is nice to be able to integrate incrementally. Also if we provide fallible methods it's not a big deal for someone to newtype std collections to only expose the fallible APIs.
And then we provide a way to convert between fallible and infallible containers while keeping the actual allocator? I'm not really sure how fluid fallibility is.
@oli-obk this adds a type-level distinction, which I said I'm not interested in providing.
Also, this involves massive language changes, which I'm not interested in pursuing.
@oli-obk that's the best proposal I've seen yet; it's how I wish things had been designed in the first place.
adds a type-level distinction, which I said I'm not interested in providing.
The opposite, cluttering the API with functions that do almost the same thing, is not an option imo. I have a hard enough time explaining tradeoffs in Rust to my students. Telling them "use nonpanicking functions returning Result everywhere, except with containers" will not improve the situation.
I think providing an as_fallible method which converts from &Vec<T, FallibleAlloc<Alloc>> to &Vec<T, Alloc> provides the runtime distinction you want.
I'm strongly considering the as_fallible approach right now, although I still don't think we want to make Vec generic, because it requires significant language changes that likely won't ever happen.
We could consider adapting the placement-in API to incorporate the notion of fallibility.
We could consider adapting the placement-in API to incorporate the notion of fallibility.
How would that work?
Make Placer::make_place() return a Result. vec.place_back() <- EXPR would then desugar to early return on failure, not unlike ?. I'm sure there could be many other options, I haven't thought about implications of any of the approaches much. catch then can be used to handle the failure locally.
See also brain dump on IRC.
Vec::reserve() can fail.
@Gankro I'm really glad to have you taking this on!
A few thoughts:
I agree (and I believe the libs team does as well) that a type distinction is very undesirable. That's partly because, as you say, sensitivity to allocation failure is often context-specific. Moreover, retaining compatibility with std types (under the default allocator) allows for greater ecosystem interop. In any case, I'd want to see very strong motivation for introducing such a distinction.
For people who want some assurance that they're using only fallible operations, it might be possible to leverage something like the portability lint idea.
I quite like the minimalistic approach you suggest. You can build quite a bit on top of it using extension traits. And if we ever wanted to provide more maximalist functionality in std, I think we'd likewise want to expose it via extension traits.
The idea of switching all methods to use Result, while elegant in its way, is not something I can see us seriously considering. (I can elaborate on that separately if desired, but I don't want to derail the thread.)
@aturon et al
We've discussed this with @Gankro on IRC, but for the purpose of visibility I'll go on record here: I believe @Gankro's proposal will handle the majority (if not virtually all) embedded use cases, which progressively fall back to a more complex error handling strategy based on domain requirements:
This also handles the most obvious hosted use case, libraries that cannot afford to bring down their host process on OOM, e.g. those handling user-controlled content such as zip or XML.
Might lead to Intense Macroification of libcollections ().
Why not just replace the implementation of x with try_x().ok_or(alloc::oom)?
I think we can do it at the type level while still not using Result everywhere in the infallible case and being backwards-compatible. You just need ATC and improved ergonomics of default type parameters:
trait Allocator {
type Result<T>;
}
impl Allocator for InfallibleAllocator {
type Result<T> = T;
}
impl Allocator for FallibleAllocator {
type Result<T> = Result<T, AllocatorError>;
}
impl<T, A: Allocator> Vec<T, A> { // A would default to InfallibleAllocator
    fn push(&mut self, v: T) -> A::Result<()>;
}
Might lead to Intense Macroification of libcollections ().
Why not just replace the implementation of x with try_x().ok_or(alloc::oom)?
I think that's the point, you'd want a macro to generate all those infallible wrappers
We don't OOM on capacity overflow today, so that would be a change to the public API. If we feed the reason for failure through then it might be fine (we can match and choose the approach). There's also the matter of whether the fallible code optimizes to the same thing as the infallible code (more branches + a payload to manipulate).
I've always been a fan of the type-based approach with associated return error types. I think it's easier to implement out the door, and I think if a client wants to mix and match fallibility, they should explicitly unwrap for clarity. A sneaky missing ? is far too easy to miss.
The existing methods would require an infallible allocator, but the default would be the global one with an infallible wrapper so no compat is broken.
I've always been a fan of the type-based approach with associated return types. I think it's easier to implement out the door
This is objectively incorrect, as it's literally impossible to implement today, even on nightly.
Edit: to be clear, such a design is blocked on generic associated types, which might be implemented and on stable in late 2018.
@Gankro you mean something like https://github.com/rust-lang/rust/issues/29802#issuecomment-317114078 ?
Yeah I don't want to do that. I want to add an associated error type to the allocator trait, make it ! in the infallibility wrapper, and then "unwrap" Result<T, !> in the infallible methods.
Oh ok, well that one just won't ever be implemented, so it's even less possible to implement today?
Huh? Let's not confuse not wanting to do something with the impossibility of doing something. This is the main benefit of the associated error type, so it's good to consider together.
Sorry, do you mean having (on Vec<T, Alloc>):
fn push(&mut self) -> Result<(), Alloc::Err> {
if cap == len { self.alloc.reserve(1)? }
...
}
or
fn push(&mut self) {
if cap == len { self.alloc.reserve(1).unwrap() }
...
}
fn try_push(&mut self) -> Result<(), Alloc::Err> {
if cap == len { self.alloc.reserve(1)? }
...
}
Ah, sorry if I was unclear before. Here's a more worked out example:
//!
//! in alloc or collections, based on how that shakes out
//!
struct PanicOom<A>(A);
impl<A: Alloc> Alloc for PanicOom<A> {
    type Error = !;
    // much repetition of this pattern:
    fn something(&mut self) {
        self.0.something().unwrap_or_else(|_| self.0.oom())
    }
}
// In the future, define and use a:
// trait InfallibleAlloc = Alloc<Error = !>;
//!
//! in collections
//!
impl<T, A: Alloc> Vec<T, A> {
    fn try_push(&mut self) -> Result<(), A::Error> {
        if cap == len { self.alloc.reserve(1)? }
        ...
    }
}
// in the future, use `InfallibleAlloc` for a more readable bound
impl<T, A: Alloc<Error = !>> Vec<T, A> {
    fn push(&mut self) {
        self.try_push().void_unwrap()
    }
}
//!
//! In std
//!
// stop gap until we have aliases with params with default args
type HashMap<K, V> = collections::HashMap<K, V, PanicOom<GlobalAlloc>>;
The idea being one either uses only vec.push() or vec.try_push(). If they mainly use vec.try_push(), they can add some manual expect, explaining why some critical allocation cannot fail. If they mainly use push, it would be silly to out-of-the-blue handle an allocation failure when the code at large is not robust against it.
This is an interesting idea, but it seems like a lot of work for a lint that none of the users I've interviewed actually seem interested in.
@Ericson2314 do you intend to use this API? What for? (feel free to contact me on irc/email/twitter if you prefer)
If we're talking about a lint for unhandled allocation failures (non-fallible allocations), I and many others are very interested. Codec libraries (and most other libs) are usually required to gracefully handle any alloc failures.
On Aug 3, 2017 12:28 PM, "Alexis Beingessner" notifications@github.com wrote:
This is an interesting idea, but it seems like a lot of work for a lint that none of the users I've interviewed actually seem interested in.
@Ericson2314 https://github.com/ericson2314 do you intend to use this API? What for?
Sorry to be clear: there is a strong desire for such a lint, but this is a lot of work for a weak version of it.
I could have sworn I linked this thread here, but I don't see it, so here's where I discuss various options and what different users were asking for: https://internals.rust-lang.org/t/notes-interviews-on-fallible-collection-allocations/5619/8
In it I describe hooking into the availability lint system that is intended to solve similar problems like "don't use floats". This would be more robust in that it would also verify you don't call methods which e.g. create their own Vec and infallibly allocate.
Well, I'm glad you find it interesting :). This is something I've murmured about for a while, but I guess I made the mistake of never actually writing it down.
Truth be told, my day job is Haskell so I personally will not have the time to be a serious consumer of the allocation APIs any time soon. I hope to find time to use it in my tiny toy OS, but that's it for now.
My general principle is that while I'm sure serious industrial users can get away without any sort of lint, it's a nice thing to have that helps everyone not shoot themselves in the foot and, more importantly, helps (no-std, for now) library authors not shoot each other in the foot, so the ecosystem can better trust itself.
@aturon mentions the portability lint, but I think that's a rather heavyweight solution for dealing with a pure-Rust interface (as opposed to however allocation is implemented). In particular, no one has proposed a modular way just yet for using it with user-defined CFG tokens. Now, cargo features would be the natural solution, and I suppose we could add an "oom-ignore" default feature for alloc/collections, but I'd rather put the feature cfg just once on OomPanic than on every single faux-infallible method. That seems more maintainable.
Ah oops -- we'd need per-method cfgs either way, but it would be easy to forget one without this, whereas the cfg on OomPanic will force us to not forget to also cfg the method (in conjunction with the portability lint).
[edited a bunch, a hard habit to kick.]
I was working on a Rust image library, which has code like:
This code should really be checking for overflow on the multiplications. But doing so only eliminates one class of problems with this code: it's still reasonable for a maliciously crafted image to have large self.width and self.height values whose product doesn't overflow usize and yet the amount of memory can't be allocated. (I discovered this through an image test suite that has images with... large widths and heights that ought to return errors, but panic'd in Rust.) Looking through the documentation, I didn't see any way of avoiding this panic-on-allocation failure, either at vector creation, or when trying to append elements to a vector.
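For illustration, here is one way such image code could handle both hazards -- dimension overflow and allocation failure -- with checked multiplication plus a fallible reservation. The helper name and the 4-bytes-per-pixel assumption are hypothetical, and `try_reserve_exact` is the fallible API that later stabilized in std:

```rust
// Hypothetical helper for an image decoder: validate width * height * bpp
// without overflow, then allocate the pixel buffer fallibly instead of
// aborting the process on OOM.
fn alloc_pixel_buffer(width: usize, height: usize) -> Result<Vec<u8>, String> {
    let len = width
        .checked_mul(height)
        .and_then(|px| px.checked_mul(4)) // assume 4 bytes per 32bpp pixel
        .ok_or_else(|| "image dimensions overflow usize".to_string())?;
    let mut buf = Vec::new();
    buf.try_reserve_exact(len)
        .map_err(|e| format!("allocating {} bytes failed: {}", len, e))?;
    buf.resize(len, 0); // cannot reallocate: capacity is already reserved
    Ok(buf)
}

fn main() {
    assert!(alloc_pixel_buffer(4, 4).is_ok());
    // A maliciously large image yields an Err instead of a panic/abort.
    assert!(alloc_pixel_buffer(usize::MAX, 2).is_err());
}
```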