boostorg / algorithm

Boost.org algorithm module
http://boost.org/libs/algorithm
Boost Software License 1.0
112 stars 105 forks source link

Implement iterate algorithms #101

Open jgopel opened 2 years ago

jgopel commented 2 years ago

Problem:

Solution:

grisumbras commented 2 years ago

Template parameter names imply that this algorithm operates on a range of output iterators. But there's no such thing. Output iterators aren't required to have equality comparison. The algorithm actually requires a range of forward iterators. Also is it really a major improvement over the following?

std::generate(first, last, [&init, op]{ return init = op(init); })
jgopel commented 2 years ago

Hi @grisumbras - thanks for the feedback! I appreciate your concerns and I'd love to arrive at a design that you think is better 🙂

Template parameter names imply that this algorithm operates on a range of output iterators. But there's no such thing. Output iterators aren't required to have equality comparison.

I used OutputIterator as a name here to follow the pattern of template names used throughout the library where the types for the iterators that are read from is InputIterator (or InputIterator1 and InputIterator2 for the dual range algorithms) and OutputIterator for the iterators that are written to.

copy_if() and its variants as well as transform_inclusive_scan() and its variants demonstrate the pattern I tried to follow well. Looking at the implementation for iota(), which is perhaps the most closely related to this algorithm, one uses ForwardIterator and the other uses OutputIterator. There are also implementations, such as is_palindrome() which use the type name to name the iterator category.

It seems like there is no clear practice that's being followed throughout the codebase. I would argue that the type of iterator should be checked by the interface, not simply held in the name. To that end - does a boost wrapper around the iterator concepts exist? They could be added as constraints and checked where available.

Also is it really a major improvement over the following?

I'll start by stating that that's a matter of opinion, but I would argue that it is an improvement. This library explicitly targets C++03 where lambdas aren't available - it's a clear win over writing a function object in that case. Even in standards where lambdas exist though, for me this API more clearly captures intent than generate. Perhaps an improvement here would be to give it a better name - perhaps what I've called iterate is actually the stateful case of generate?

elbeno commented 2 years ago

A few observations:

The question "is iterate preferable to generate?" has a similar flavour to the question "is accumulate preferable to for_each?" In both cases the same effect as the "dedicated" function can be achieved with the "more general" function and stateful lambda expressions. But I consider both accumulate and iterate to be important upgrades in intent and in composability. Both also allow a more declarative coding style.

iterate is an unfold, just as accumulate is a fold, and comparing iterate to accumulate (and contrasting with generate/for_each) is useful. iterate wins over generate in a couple of important ways:

One last thing: the name iterate is taken from Haskell. I haven't conducted a survey of other languages to see what the prevailing name of least surprise, if any, would be.

grisumbras commented 2 years ago

It seems like there is no clear practice that's being followed throughout the codebase.

The pattern is evidently that the name of the template parameter type corresponds to the named requirement. And regardless, using OutputIterator for something that is a mutable forward iterator is not ideal.

With regards to the name of the functions. iterate is a very generic name. How about calling it generate, since that's what it appears to be?

jgopel commented 2 years ago

I was misunderstanding the concern on iterator requirements - I've updated that to a state that I think is correct.

since C++20, std::accumulate moves the value through the algorithm; IMO iterate/iterate_n should do the same, although obviously not available before C++11 - presumably there is a boost macro?

I have not been able to find a boost macro for move, unfortunately. If one exists, I'll happily pepper it throughout as I agree that that makes the behavior more flexible.

was there a reason to do n -= 1 rather than --n? IMO decrement is preferred since it's a lower strength operation than generalized subtraction.

No - just a code style I've been trying out. (Roughly, the idea is that if prefix ++ is better than postfix ++ in terms of communication of intent about reuse, then += 1 is even more clear that the purpose of the operation is incrementing and that the result of that is not needed.) I've updated to -- as I agree that it's the right choice in the generic case.

One last thing: the name iterate is taken from Haskell. I haven't conducted a survey of other languages to see what the prevailing name of least surprise, if any, would be.

I did some digging on this and couldn't find any other languages with similar named facilities where it wasn't called iterate. I have reached out to some people for suggestions there - I'll post here if anything comes back.

jgopel commented 2 years ago

An update on the naming:

As far as I can tell, this algorithm is only named in functional language standard libraries. Haskell, Clojure, OCaml, and Scala (and perhaps more, those were the ones that I found) have this algorithm and all of them call it iterate.