rust-lang / libs-team

The home of the library team
Apache License 2.0
115 stars 18 forks source link

ACP: Avoid the common mistake of `for x in 0u8..` #304

Open crlf0710 opened 9 months ago

crlf0710 commented 9 months ago

Proposal

Problem statement

When a type has finite possible values and total ordering, it is tempting to write this code: for x in SOME_VAL.. {} However this is plain incorrect... It panics with overflow in debug mode and loops infinitely in release mode, for integers.

Linting against this pattern is helpful, but it's more meaningful to have a correct fix for the expected behavior.

Motivating examples or use cases

Integers and char types(see rust-lang #111514) and many other types might benefit from this.

Solution sketch

Maybe add a to_inclusive() api to open-ended ranges types, such that: (0u8..).to_inclusive() => (0u8..=255u8)

Alternatives

TO BE ADDED.

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

Second, if there's a concrete solution:

pitaj commented 9 months ago

As I understand it, the behavior where RangeFrom<N> overflows at the extreme is a consequence of the range types implementing Iterator directly.

In order to prevent the overflow and stop at the max value, we would need to implement a flag similar to RangeInclusive.exhausted. But doing that would increase the size of of RangeFrom by a factor of 2 and would be a breaking change.

So the real solution to this is the same solution to the impl Copy for Range* issue: figure out how to transition the range types from directly implementing Iterator to implementing IntoIterator instead. Then the flag can just live on the iterator type instead of the Range type.

pitaj commented 9 months ago

All that said, it would be good to have a lint for this. Please open a clippy issue for that.

the8472 commented 9 months ago

There's nothing particularly wrong with them if you expect to find a value in that range and then break or return. Using RangeInclusive would be worse for performance.

pitaj commented 9 months ago

Well, one problem is that the iterator can only ever yield 0..=MAX-1 because to yield the maximum it needs to increment the iterator state past it.

So the lint can actually just recommend 0..MAX, no need for inclusive.

okaneco commented 9 months ago

As another potential motivation, a bug was introduced to the image-gif crate because of an open-ended range 2 years ago. It's very easy to hit this with u8. https://github.com/image-rs/image-gif/commit/0215cf2210efb5f5d83cdfaf2ff411b693e031ae

The behavior is confusing to explain. If someone writes the non-panicking for loop, they would expect the collect code to work too but it panics in debug. It's easy to miss code like this because you might only run it in --release mode unless you have tests that cover it.

fn panic() {
    let x = (0..).zip([0; 256].iter()).collect::<Vec<(u8, &u8)>>();
    println!("{}", x.len());
}

fn no_panic() {
    let mut sum = 0u32;
    for (a, _b) in (0..).zip([0; 256].iter()) {
        sum += a;
    }
    println!("{sum}");
}

(playground link)

Personally, instead of an open-ended range iterator I almost always want (0..=MAX), so I write that to avoid possible panics. If I do use an open-ended range, I make sure to immediately follow with a take or zip with something that I know is shorter.

edit: Sorry if this is off-topic. I realize my post is more about the hazards of using open-ended ranges than the specific for x in 0u8.. pattern.

Rudxain commented 3 months ago

playground link

That's very surprising, to say the least. Thank you for reporting it!

Sorry if this is off-topic. I realize my post is more about the hazards of using open-ended ranges than the specific for x in 0u8.. pattern.

Don't worry. I would dare to say for x in u16.. or even u32 should be avoided too. Long-running "real-time" systems must not panic for something like this

kennytm commented 3 months ago

Long-running "real-time" systems must not panic for something like this

It won't "panic" if you turn off -C overflow-checks. If the "real-time" system accepts a + b potentially panicking then they should have no problem with u8..'s panic. And also vice-versa.

Rudxain commented 3 months ago

If the "real-time" system accepts a + b potentially panicking then they should have no problem with u8..'s panic

Correct! The example I gave was bad. But my intention/opinion is the same: the code is a potential footgun, no matter the int size (although u128 is "extreme enough" to not be concerning, in practice)