rust-lang / libs-team

The home of the library team
Apache License 2.0
123 stars 19 forks source link

Adding the contains method for the iterator trait #408

Closed yonikremer closed 2 months ago

yonikremer commented 2 months ago

Proposal

Problem statement

Currently, checking if an iterator contains a specific element requires the use of any method, which involves more verbose and less intuitive code. Additionally, while the itertools crate provides a contains method, it is inefficient to rely on an external crate for such a fundamental and simple operation. This makes common tasks like searching for an element in an iterator more cumbersome than necessary, reducing code readability and developer efficiency.

Motivating examples or use cases

Example 1: Simplifying Search Logic

Current Approach:

let numbers = vec![1, 2, 3, 4, 5];
let contains_three = numbers.iter().any(|&x| x == 3);

Proposed Approach:

let numbers = vec![1, 2, 3, 4, 5];
let contains_three = numbers.iter().contains(&3);

Example 2: Checking for Characters in Strings

Current Approach:

let chars = "hello".chars();
let contains_h = chars.clone().any(|c| c == 'h');

Proposed Approach:

let chars = "hello".chars();
let contains_h = chars.contains(&'h');

Example 3: Consistency Across Collections

Current Approach:

let numbers = vec![1, 2, 3, 4, 5];
let contains_three_vec = numbers.iter().any(|&x| x == 3);

let set: HashSet<i32> = [1, 2, 3, 4, 5].iter().cloned().collect();
let contains_three_set = set.contains(&3);

Proposed Approach:

let numbers = vec![1, 2, 3, 4, 5];
let contains_three_vec = numbers.iter().contains(&3);

let set: HashSet<i32> = [1, 2, 3, 4, 5].iter().cloned().collect();
let contains_three_set = set.iter().contains(&3);

Solution sketch

The contains method will be added to the Iterator trait. This method will take an item and return true if any item in the iterator matches the given item, and false otherwise.

Example Implementation:

trait Iterator {
    // existing methods...

    fn contains(&mut self, item: Self::Item) -> bool
    where
        Self::Item: Eq,
    {
        for element in self {
            if element == item {
                return true;
            }
        }
        return false;
    }
}

Alternatives

Alternative 1: Continue Using any

The primary alternative is to continue using any method. However, this approach is less readable and more verbose:

let contains_three = numbers.iter().any(|&x| x == 3);

Alternative 2: Use the itertools Crate

Another alternative is to use the itertools crate, which already provides a contains method. However, requiring an external crate for such a simple and fundamental operation is inefficient and unnecessary. Relying on external crates for basic functionality can lead to increased dependencies, larger binaries, and potential compatibility issues.

tgross35 commented 2 months ago

I think this would be a nice simplification to have.

The itertools version uses Borrow rather than just PartialEq (why strict Eq in this proposal?) https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.contains. I think that we would probably want to do the same, which is also used for other collections ({Hash,BTree}{Map,Set}). slice::contains uses PartialEq, but it means that you cannot slice_of_owned_string.contains(some_borrowed_str), which is very annoying.

Ideally we would fix slice::contains too but I don't think we can...

(could you add rust to your code blocks for syntax highlighting?)

kennytm commented 2 months ago

Since RangeBounds::contains also existed this new method on Iterator will cause inference ambiguity for legacy ranges. But I guess it does not really matter, when a Range/RangeFrom/RangeInclusive implemented both Iterator and RangeBounds their contains()s' results should be equivalent. Edit: actually the ranges have inherent impl of the contains() functions which always take precedence so neither trait method would be chosen.

Amanieu commented 2 months ago

We discussed this in the libs-api meeting. We're happy to have this as unstable for now, though the team was split over the usefulness of this compared to just directing users to use .any instead.

We did discuss the specific signature and think this signature would work better in practice since it allows for things like comparing str and String.

fn contains<Q: ?Sized>(&mut self, item: &Q) -> bool
where
    Self: Sized
    Self::Item: PartialEq<Q>,
the8472 commented 2 months ago

Probably not a good idea but since closures and function items don't implement PartialEq we theoretically could add a trait Predicate and then impl Predicate for P where P: FnMut(&Self::Item) -> bool and impl Predicate for &T where T: PartialEq<&Self::Item>

then any and find could also take a &T.

kennytm commented 2 months ago

since closures and function items don't implement PartialEq we theoretically could add a trait Predicate

Function pointers do. Besides, your PartialEq is against the iterator's Item rather than the function itself, making it simple to create a concrete overlapping instance:

struct X;
impl PartialEq<X> for fn(&X) -> bool {
    fn eq(&self, _: &X) -> bool {
        false
    }
}

fn check_fn_mut<I>(_: impl FnMut(&I) -> bool) {}
fn check_partial_eq<I>(_: &impl PartialEq<I>) {}

fn main() {
    let g: fn(&X) -> bool = |_| true;
    check_fn_mut::<X>(&g);
    check_partial_eq::<X>(&g);
}
yonikremer commented 2 months ago

I have a working implementation that passes all the test cases I wrote. Here is it:

    /// Checks if the Iterator contains the value.
    /// 'contains' is short-circuiting; in other words, it will stop processing
    /// as soon as the closure returns `true`.
    ///
    /// Performance:
    /// This method checks the whole iterator, which takes O(n) time.
    /// If the iterator is sorted, or a hash map please use the appropriate method.
    ///
    /// Example:
    /// ```
    /// #![feature(contains)]
    /// assert!(![1i32, 2i32, 3i32].iter().contain(&4i32));
    /// assert!([Some(2i32), Option::<i32>::None].iter().contain(&None));
    /// assert!([Some(2i32), Option::<i32>::None].iter().contain(&Some(2i32)));
    /// assert!(!Vec::<i32>::new().iter().contain(&1i32));
    /// assert!([1i32, 2i32, 2i32, 3i32].iter().contain(&2i32));
    /// #[derive(PartialEq)]
    /// struct Item {
    ///     value: i32,
    /// }
    /// assert!([Item { value: 1i32 }, Item { value: 2i32 }].iter().contain(&Item { value: 2i32 }));
    /// assert!(["a", "b", "c"].iter().contain(&"b".to_owned()));
    /// assert!(!["a", "b", "c"].iter().contain(&"d".to_owned()));
    /// assert!(["a", "b", "c"].iter().contain(&"b"));
    /// assert!(!["a", "b", "c"].iter().contain(&"d"));
    /// assert!(["a".to_owned(), "b".to_owned(), "c".to_owned()].iter().contain(&"b"));
    /// assert!(!["a".to_owned(), "b".to_owned(), "c".to_owned()].iter().contain(&"d"));
    /// assert!((1..1000).contain(500i32));
    /// ```
    ///
    #[unstable(feature = "contains", reason = "new API", issue = "127494")]
    fn contain<Q: ?Sized>(&mut self, item: Q) -> bool
    where
        Q: PartialEq<Self::Item>,
        Self: Sized,
    {
        for element in self {
            if item == element {
                return true;
            }
        }
        return false;
    }
}

Should I submit a pull request now?

tgross35 commented 2 months ago

Should I submit a pull request now?

Yes, but please rename the feature iterator_contains (contains is too general).

Also all your tests don't need to be the doctest - keep that simple, everything else can go in library/core/tests/iter.

pitaj commented 2 months ago

And don't forget to create a tracking issue for the iterator_contains feature

tgross35 commented 2 months ago

(https://github.com/rust-lang/rust/issues/127494 exists but the gate name and API need to be updated)

yonikremer commented 2 months ago

@tgross35 I have renamed the function to has_item to avoid breaking code that uses the itertools contains function.

kennytm commented 2 months ago

@yonikremer you shouldn't need to care about itertool's naming given rust-lang/rfcs#3624 will likely be adapted. Just pick the most natural name.