rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
95.11k stars 12.27k forks source link

Tracking Issue for `new_range_api` (part of RFC 3550) #125687

Open pitaj opened 1 month ago

pitaj commented 1 month ago

Feature gate: #![feature(new_range_api)]

This is a tracking issue for the library additions that are part of RFC 3550: New range types

Tracking issue for the language change: #123741

This feature introduces new types implementing Copy and IntoIterator that are meant to replace the legacy range types, which are !Copy and implement Iterator directly.

Public API

// core::range

#[derive(Clone, Copy, Default, PartialEq, Eq, Hash)]
pub struct Range<Idx> {
    pub start: Idx,
    pub end: Idx,
}
#[derive(Clone, Copy, Default, PartialEq, Eq, Hash)]
pub struct RangeInclusive<Idx> {
    pub start: Idx,
    pub end: Idx,
}
#[derive(Clone, Copy, Default, PartialEq, Eq, Hash)]
pub struct RangeFrom<Idx> {
    pub start: Idx,
}

pub struct IterRange<Idx> { <private fields> }
pub struct IterRangeInclusive { <private fields> }
pub struct IterRangeFrom { <private fields> }

impl<Idx: Step> Iterator for IterRange<Idx> { ... }
impl<Idx: Step> Iterator for IterRangeInclusive<Idx> { ... }
impl<Idx: Step> Iterator for IterRangeFrom<Idx> { ... }

impl<Idx: Step> IntoIterator for Range<Idx> {
    type IntoIter = IterRange<Idx>;
    ...
}
impl<Idx: Step> IntoIterator for RangeInclusive<Idx> {
    type IntoIter = IterRangeInclusive<Idx>;
    ...
}
impl<Idx: Step> IntoIterator for RangeFrom<Idx> {
    type IntoIter = IterRangeFrom<Idx>;
    ...
}

See the RFC for more details.

Steps / History

Unresolved Questions

joshtriplett commented 1 month ago

Capturing something from a verbal discussion:

In theory, since the new range types aren't iterators directly, and just get turned into iterators, we could automatically handle for i in 10..0 and do the expected thing. However, @eholk pointed out that while that'd be intuitive for integer literals, it'd be surprising for the general m..n to implicitly do reverse iteration. So, we shouldn't do this.

joshtriplett commented 1 month ago

Should RangeFrom even implement IntoIterator, or should it require an explicit .iter() call? Using it as an iterator can be a footgun, usually people want start..=MAX instead. Also, it is inconsistent with RangeTo, which doesn't implement IntoIterator either. But, it's extremely useful for it.zip(1..).

We discussed this in today's @rust-lang/libs-api meeting, and our conclusion was: the usefulness of .zip(n..) is high enough that we should make this work. We should ensure that it yields values up to the maximum integer value, and then panics if asked for another value.

Amanieu commented 1 month ago

We discussed the unresolved questions in the libs-api meeting yesterday.

There was a lot of opposition to inherent methods, especially map and rev, because they would need to behave differently on a value type than on an iterator. map would be expected to work like Option::map and apply a function to the start and end bounds. rev would be expected to return a range (not an iterator).

However there was some support for adding an iter inherent method which would be a short-hand for .clone().into_iter(). We felt that the additional 5 characters of into_iter (especially the underscore) would make people's code too verbose for such a common operation.

  • The exact module paths and type names

    • Should the new types live at std::ops::range:: instead?
    • IterRange, IterRangeInclusive or just Iter, IterInclusive? Or RangeIter, RangeInclusiveIter, ...?
  • Should other range-related items (like RangeBounds) also be moved under the range module?

We were mostly happy with how things are currently set out in the PR.

  • Should RangeFrom even implement IntoIterator, or should it require an explicit .iter() call? Using it as an iterator can be a footgun, usually people want start..=MAX instead. Also, it is inconsistent with RangeTo, which doesn't implement IntoIterator either. But, it's extremely useful for it.zip(1..)

It should implement IntoIterator since, as mentioned, it is useful for things like it.zip(1..).

We also discussed the RangeFrom iterator: the implementation should add a hidden bool so that any overflow panics happen after returning the maximum integer value, not before. To avoid issues with LLVM's loop optimizations, we may want this bool to only be used in debug builds: in release builds it may acceptable to allow the iterator to wrap.

  • Should there be a way to get an iterator that modifies the range in place, rather than taking the range by value? That would allow things like range.by_ref().next().

This should be deferred for a future API proposal, only if it turns out there is a need for this.

  • Should there be an infallible conversion from legacy to new RangeInclusive?

Yes there should be. The case of converting an exhausted RangeInclusive iterator should be extremely rare in practice, so it's fine to panic in that case.

pitaj commented 1 month ago

There was a lot of opposition to inherent methods, especially map and rev, because they would need to behave differently on a value type than on an iterator.

I understand the arguments for purity here, but the immense utility these offer should be taken into account. I'll remove them from the PR, but I think the question should be reconsidered before the edition change is stabilized.

Arguments for `map` and `rev` I think range types are in a very unique position, here. `map` and `rev` are 1. commonly used in the wild 2. present in much documentation 3. demonstrated in countless learning materials 4. a useful entry into iterator chains While shaving five characters and the shift necessary for the underscore is helpful, the remaining seven characters and two shifts for `.iter()` is still a burden and needless visual noise. My position is that there is no other meaning we could assign to `map` that would be acceptable given the history of using it in this position. ### `map` is expected to map the bounds, not the yielded elements 1. It is already a widely used pattern 2. Changing `map` to mean "map the bounds" would be a sizeable hazard, because it could silently change the meaning of `for (1..5).map(|n| n*2) { ... }` across an edition boundary. 3. So, a function doing so would need to be named something like `map_bounds` anyways. 4. `Range` is not like `Option` or `Vec`. It's not commonly used to hold values, it's purpose is to generate values. ### `rev` is expected to return something like `RevRange` IMO this argument is stronger, because `for x in (1..5).rev()` would still work. That covers one of the most common use-cases. However, I don't see much use for a `RevRange` type. What would it implement besides `IntoIterator`? - not `RangeBounds` - probably not `Index` - could have `is_empty` and `contains` but those are direction-agnostic It it only serves as an iterable, we may as well just use `Rev>` anyways.