Open vandenheuvel opened 3 years ago
Do you also want the Rust compiler to be (optionally) more verbose regarding the optimizations it performs and the ones it fails to perform on the code?
Is it even possible to build such a feature? Can the compiler determine which optimizations would be possible, but fail to apply them?
Perhaps the best way to go is to create a lint that warns for "missing" implementations.
This isn't specific to enums. A major reason for for_each
's existence (which currently is implemented in terms of fold
) is better performance for some iterator sources.
The fold documentation has an explicit note explaining this: https://doc.rust-lang.org/std/iter/trait.Iterator.html#note-to-implementors-1
I think the pattern in question is not enums, but something like this:
impl Iterator for $Type {
fn next(&mut self) -> Self::Item {
// If `$condition` does not mutate `self`, then it will be the same
// with every `next` invocation. (critical assumption)
// `if $condition` may also be `match $condition` (more likely for enums)
if $condition {
self.x.next() // delegate to an inner iterator
} else {
self.y.next() // delegate to a different inner iterator
} // may be more conditions and inner iterators `else if ...`
}
}
It is quite possible for impl Iterator for EnumType
to not follow this pattern, and the optimization would not apply.
In any case, I'm pretty sure it would be unsound for rust to magically adjust the semantics of fold
etc. for these cases. But a Clippy lint is plausible.
Update: it seems that on the latest nightly, the compiler actually produces equivalent assembler for the original example! In the more general case, a lint might still be useful, though.
Almost all implementations of
Iterator::next
for enum types will at some point branch based on the variant of the current value. When that variant does not change during iteration, it is beneficial to optimize usages ofnext
to avoid repeatedly branching.Consider this example. The right snipped is equal to the left snipped except for the added fold implementation. Without that extra fold implementation, the resulting code is quite inefficient in comparison to the
goal
function.For a human being, this situation is quite easy to reason about. Would it be possible to include an optimization for this situation?