rust-lang / rust-analyzer

A Rust compiler front-end for IDEs
https://rust-analyzer.github.io/
Apache License 2.0
14.36k stars 1.61k forks source link

Deny usages of `HashMap::iter`, `HashMap::into_iter`, `HashMap::values`, etc #18358

Open Veykril opened 1 month ago

Veykril commented 1 month ago

A discussion issue, as we all know, iteration order for these should not be relied on as that can cause bugs (and unstable tests) so I've been wondering, we could forbid the use of these via clippy (but allow them for IndexMap), requiring to annotate usages with #[allow(...)] to make their use very explicit for cases where we know they are not problematic. Alternatively we could add an extension trait that re-exposes them again as well as "sorted"/"stable" variants, that way we know exactly where we might be accidentally relying on the order. Reason I am raising this is https://github.com/rust-lang/rust-analyzer/pull/17954 which should some problematic tests.

Thoughts on this? https://rust-lang.zulipchat.com/#narrow/channel/185405-t-compiler.2Frust-analyzer/topic/Deny.20usages.20of.20.60HashMap.3A.3Aiter.60.2C.20.60HashMap.3A.3Ainto_iter.60.2C.20etc

davidbarsky commented 1 month ago

I'm very much in favor of a blanket deny and a case-by-case #[allow(...)] with a comment explaining why this is desirable behavior. I think this would be a necessary (but not sufficient!) step to getting to supporting (near) perfect replayability: https://github.com/rust-lang/rust-analyzer/issues/10086

Veykril commented 1 month ago

Okay so we agree on denying, I think I'd prefer an extension trait though to opt-in to iteration methods (as that makes it easier to search for references for relevant things) instead of #[allow]ing.

So I imagine traits like


pub trait HashMapIterate<K, V> {
    fn into_iter(self) -> impl Iterator<Item = (K, V)>;
    fn iter<'slf>(&'slf self) -> impl Iterator<Item = (&'slf K, &'slf V)>
    where
        K: 'slf,
        V: 'slf;

    fn values<'slf>(&'slf self) -> impl Iterator<Item = &'slf V>
    where
        V: 'slf;
    fn keys<'slf>(&'slf self) -> impl Iterator<Item = &'slf K>
    where
        K: 'slf;
}
pub trait HashMapIterateStable<K, V> {
    fn into_iter_stable(self) -> impl Iterator<Item = (K, V)>;
    fn iter_stable<'slf>(&'slf self) -> impl Iterator<Item = (&'slf K, &'slf V)>
    where
        K: 'slf,
        V: 'slf;
    fn keys_stable<'slf>(&'slf self) -> impl Iterator<Item = &'slf K>
    where
        K: 'slf;
    fn values_stable<'slf>(&'slf self) -> impl Iterator<Item = &'slf V>
    where
        V: 'slf;
}
impl<K, V, S> HashMapIterate<K, V> for std::collections::HashMap<K, V, S> {
    fn into_iter(self) -> impl Iterator<Item = (K, V)> {
        IntoIterator::into_iter(self)
    }
    fn iter<'slf>(&'slf self) -> impl Iterator<Item = (&'slf K, &'slf V)>
    where
        K: 'slf,
        V: 'slf,
    {
        IntoIterator::into_iter(self)
    }

    fn values<'slf>(&'slf self) -> impl Iterator<Item = &'slf V>
    where
        V: 'slf,
    {
        self.values()
    }
    fn keys<'slf>(&'slf self) -> impl Iterator<Item = &'slf K>
    where
        K: 'slf,
    {
        self.keys()
    }
}
impl<K: Ord, V, S> HashMapIterateStable<K, V> for std::collections::HashMap<K, V, S> {
    fn into_iter_stable(self) -> impl Iterator<Item = (K, V)> {
        let mut v: Vec<_> = IntoIterator::into_iter(self).collect();
        v.sort_by(|(k1, _), (k2, _)| k1.cmp(k2));
        v.into_iter()
    }
    fn iter_stable<'slf>(&'slf self) -> impl Iterator<Item = (&'slf K, &'slf V)>
    where
        K: 'slf,
        V: 'slf,
    {
        let mut v: Vec<_> = IntoIterator::into_iter(self).collect();
        v.sort_by_key(|&(k, _)| k);
        v.into_iter()
    }
    fn keys_stable<'slf>(&'slf self) -> impl Iterator<Item = &'slf K>
    where
        K: 'slf,
    {
        let mut v: Vec<_> = self.keys().collect();
        v.sort();
        v.into_iter()
    }
    fn values_stable<'slf>(&'slf self) -> impl Iterator<Item = &'slf V>
    where
        V: 'slf,
    {
        self.iter_stable().map(|(_, v)| v)
    }
}

Whether the split is necessary I'm not sure (was thinking of IndexMap being able to implement the stable one without the K: Ord as its inherently stable iterable, which means we wouldnt need to forbid its methods in the first place I guess?)