rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.24k stars 12.57k forks source link

`Vec.resize()`, `Vec.resize_with()` should be marked as panicking if the new capacity exceeds `isize::MAX` bytes #117437

Open LikeLakers2 opened 11 months ago

LikeLakers2 commented 11 months ago

Location

In the documentation for the following functions:

I noted some trait impls where they might want to be added (such as the Extend trait impls on Vec), but I'm unsure if it makes sense for the "Panics" notes to be added to them.

Summary

These should be marked with the following:

# Panics

Panics if the new capacity exceeds `isize::MAX` bytes.

as they all, in one way or another, call Vec.reserve(), which itself will panic if the new capacity exceeds isize::MAX bytes.

Because other functions that call Vec.reserve() (such as Vec.push() or Vec::with_capacity()) are marked with this "Panics" note, not marking these could result in confusion - namely, the user might think that these functions somehow allow bypassing the isize::MAX byte limit.

the8472 commented 11 months ago

Because other functions that call Vec.reserve() (such as Vec.push() or Vec::with_capacity()) are marked with this "Panics" note, not marking these could result in confusion - namely, the user might think that these functions somehow allow bypassing the isize::MAX byte limit.

As a general principle I discourage such negative reasoning. Because it sets terrible incentives for documentation authors. If we supported this kind of reading then adding a new warning about behavior somewhere would then implicitly require this warning to be added everywhere it applies because the absence-of-warning could otherwise be used to infer that it doesn't apply. And to avoid such an inference-cascade then authors would not add warnings anywhere.

APIs should generally only be read as guaranteeing the behavior that's explicitly stated. Additional behavior beyond that may be best-effort, required to uphold guarantees specified somewhere else, or an implementation-necessity; but not a contract on which the user should build their software.

I'm not opposed to adding the panic note, but I think it would be better to link to the general invariants of allocations in the Vec struct comment to make a general statement that applies to all its methods on top rather than doing it for every individual method.

LikeLakers2 commented 11 months ago

@the8472 I don't think this is a matter of "not every possible panic is documented", as it is an issue of inconsistency among the documentation of Vec.

As it stands, Vec has a precedent set among many pieces of its documentation: if a function can increase the allocated capacity, then it gets that panic note, as it's possible the user may increase capacity beyond isize::MAX bytes.

Thus, since resize and resize_with can increase the allocated capacity, they too should have that panic note.

(As for making a general statement instead of putting that statement on every individual method - I have no opinion either way.)

the8472 commented 11 months ago

I don't see "consistency" as a good argument. You should look deeper why you want something. Consistency is a proxy for some poorly understood heuristic. If your heuristic is negative reasoning then it is likely a source of problems when it comes to interpreting API contracts. If you have some other reason then it may be entirely sensible.

if a function can increase the allocated capacity, then it gets that panic note, as it's possible the user may increase capacity beyond isize::MAX bytes.

Sure, informing the user that a method may panic can be useful if they want to write panic-free code.

But that is quite different from your originally stated motivation which said that the user can draw the inference that the method can be used to bypass the limitation present in other methods.

LikeLakers2 commented 11 months ago

But that is quite different from your originally stated motivation which said that the user can draw the inference that the method can be used to bypass the limitation present in other methods.

It is not.

As a developer, I have to draw inferences all the time about library functions, lest I be forced to dive deep into their code and spend precious time that I could be spending coding. The least of these inferences is that the documentation is consistent - that if the documentation does something in one place, it probably does it everywhere else too.

In this case, the consistent thing I noted is that, where a function can increase a Vec's capacity, it usually will say that it will panic if the new capacity exceeds isize::MAX bytes. When I saw that Vec.resize and Vec.resize_with did not have this same note, I drew an inference: In some way, Vec.resize and Vec.resize_with were not subject to those same capacity limits. I later found my inference to be wrong, and in response, filed this issue so that the documentation can be fixed - so others don't make the same wrong inference I did.

If you wish to call that an issue with understanding API contracts, then go ahead. But I really think you're missing the point here - that consistency across documentation is expected, especially among the core Rust libraries like std.

the8472 commented 11 months ago

Consistency is not mechanically enforced, so such expectations will frequently run aground. And different people have different assumptions what needs to be spelled out again and again. The isize::MAX restriction is a fundamental restriction of all type layouts in rust. So ideally we would teach this not as a property of individual methods. The documentation is basically just repeating the fundamental thing.

For example one usually does not document that calling a method can cause panics due to stack overflows. Except someone may have left a note on a method that is especially prone to doing so due to a recursive implementation or large stack allocations. The absence of such a note in other places doesn't mean those places don't imply they don't take up stack space.

LikeLakers2 commented 11 months ago

Oh my god... I just wanted to open an issue to ensure the documentation stayed consistent within itself, not discuss the overarching implications of memory allocations as a whole.

All I'm asking for is a few additional doc-comments saying that it could crash under those circumstances, so it's consistent with the rest of Vec. I couldn't care less if the crash is due to type layouts or if it's an arbitrary restriction that has no basis in reality.

But if being consistent is a problem, I really don't know what to tell you. I'm not you, I guess?

LikeLakers2 commented 11 months ago

I apologize for my outburst above.

That said, I do just want the documentation to be consistent. Where many pieces of std's documentation do something, I would hope all of them would do the same. Whether that consistency consists of the same accurate comment 30 times over, or one generalized statement that applies to all of Vec, I don't really care - I just want to ensure that the documentation stays consistent, as documentation that is inconsistent only really invites the reader to be confused.

There isn't anything else to it. It's not "a proxy for a poorly understood heuristic." It's simply that I want the documentation to stay consistent - to apply things such as this panic note uniformly, rather than only applying it part of the time.