Closed: ericniebler closed this issue 9 years ago.
I was thinking of adding `tagged_pair` and `tagged_tuple` types, like this:
```cpp
template<class... Tags>
struct tagged_tuple;

template<class Indices, class... Tags>
struct tagged_tuple_accessors;

template<std::size_t... Is, class... Tags, class... Types>
struct tagged_tuple_accessors<index_sequence<Is...>, Tags(Types)...>
  : Tags::template accessor<tagged_tuple<Tags(Types)...>, Types, Is>... { };

template<class... Tags, class... Types>
struct tagged_tuple<Tags(Types)...>
  : std::tuple<Types...>
  , tagged_tuple_accessors<make_index_sequence<sizeof...(Tags)>, Tags(Types)...> {
    using std::tuple<Types...>::tuple;
};
```
Then accessor structs can be defined like this:
```cpp
struct input_tag {
    template<class Derived, class Type, std::size_t I>
    struct accessor {
        constexpr Type input() const {
            return std::get<I>(static_cast<const Derived&>(*this));
        }
    };
};

struct output_tag {
    template<class Derived, class Type, std::size_t I>
    struct accessor {
        constexpr Type output() const {
            return std::get<I>(static_cast<const Derived&>(*this));
        }
    };
};
```
You can use the type like this:
```cpp
tagged_tuple<input_tag(int), output_tag(std::string)> s{42, "hello world"};
std::cout << s.input() << '\n';
std::cout << s.output() << '\n';
std::cout << std::get<0>(s) << '\n';
std::cout << std::get<1>(s) << '\n';
```
That prints:
```
42
hello world
42
hello world
```
It's still a `tuple`, so you should be able to `tie`, assign to `tuple`s, etc. But you have friendly named getters.
I'm undecided, though. It might go too far.
One of the main goals of introducing concepts and STL2 was to make the error messages from instantiation readable. I am a bit concerned that using such a `tagged_tuple` as a return type will reintroduce the problem. Just think of the error message generated when someone accidentally invokes `input()` instead of `input1()` on the result of `merge`, and then tries to check what methods are available.
On the other hand, it is a very nice implementation of tagged tuples.
The error they'll get is "no such member". Code completion in their IDE will suggest the correct members. If they check the return type, they'll see `tagged_tuple<input1(It1), input2(It2), output(Out)>` or something. That's pretty good, IMO.
I must admit that your solution will not make use of the algorithms less user-friendly than a naked tuple; rather, it will improve it.
Wouldn't use of a tagged tuple resolve the problem of differentiating the situation when the returned pair of iterators is a range (as with `lower_bound`) from when the iterators point into different ranges (`mismatch`)? In the first case, something like `tagged_tuple<range(It, It)>` (which would require some magic to support multi-element accessors, but should be doable) would be returned and would model a range, while for the second, `tagged_tuple<input1(It1), input2(It2)>` would be returned.
> `tagged_tuple<range(It, It)>`

More like `tagged_tuple<begin(It), end(It)>`.
> `tagged_tuple<begin(It), end(It)>`

I proposed `tagged_tuple<range(It, It)>` to avoid `tagged_tuple<begin(It), output(It)>` etc.
> `tagged_tuple<begin(It), output(It)>`

I don't get it. None of the algorithms would have that return type.
> I don't get it. None of the algorithms would have that return type.
I thought that `tagged_tuple` would be a general-purpose class that could be used for other purposes too. In the general case, `range(It, It)` is less error prone than `begin(It), end(It)`.
> I thought that `tagged_tuple` would be a general-purpose class

It is.

> In the general case, `range(It, It)` is less error prone than `begin(It), end(It)`

A struct with `begin()` and `end()` member functions is error prone? You're using the wrong library if you feel that way. ;-) The nice thing about `tagged_tuple<begin(It), end(It)>` is that it's a range. I don't have to do anything further.
> A struct with `begin()` and `end()` member functions is error prone? You're using the wrong library if you feel that way. ;-) The nice thing about `tagged_tuple<begin(It), end(It)>` is that it's a range. I don't have to do anything further.
Providing `begin` or `end` separately is more error prone than a `range` that declares them as a bundle, since the former allows one to declare a half-baked range (only `end` provided). This has nothing to do with range concepts. In general, if we require multiple functions, then I would prefer to have a single mixin that adds them all, not multiple mixins that add them separately.
Just consider the example of a counted version, `find_if_n`, that accepts an `Iterator`, a `DifferenceType` and a `Predicate` and returns `pair<Iterator, DifferenceType>`. Would we be able to treat `tagged_tuple<begin(It), count(DifferenceType)>` as a range? Would it be easier to do so for `tagged_tuple<counted_range(It, DifferenceType)>`?
If there were a `find_if_n` (there isn't), it should return a `pair<Iterator, Difference>` -- possibly with `input` and `count` accessors. I don't know why you think it should return a counted range.
> If there were a `find_if_n` (there isn't), it should return a `pair<Iterator, Difference>` -- possibly with `input` and `count` accessors. I don't know why you think it should return a counted range.
Because the returned iterator and the count define a valid range, starting at the first element meeting the predicate and ending at the end of the original range -- in the same way that the iterator argument and count passed to the function define an input range.

It's the same situation as with `rotate`, which now returns a pair of iterators: the new split position and the end of the range. We know that these two iterators always denote a valid range, and this fact should be expressed in the semantics of the returned object.
The `find` family of algorithms locates a position within a range. They all return positions -- iterators. As a courtesy, `find_n`, if such an algorithm existed, would probably also return the distance to the end, since it needs to compute it anyway, but that doesn't change the fact that it fundamentally returns a position. Returning a range would be inconsistent with the other `find` algorithms.

But there is no `find_n` algorithm, and with the existence of `view::counted` and range-based algorithm overloads, there isn't likely to be. Instead, `find` with a counted view would return a counted iterator, which is both position and count. That's the model of STL2.

The algorithms don't compose. Trying to make them compose leads to an inconsistent mess. I've tried; it doesn't work.
> Returning a range would be inconsistent with the other `find` algorithms.
The explicitly counted algorithms already have inconsistent signatures. They accept different arguments and have different result types.
> The algorithms don't compose. Trying to make them compose leads to an inconsistent mess. I've tried; it doesn't work.
My point is not about composability, but about having a return type that conveys as much semantic information as possible. If the pair of iterators happens to model a range, the return type should model the range concept (`mismatch` vs `rotate`). If the iterator and count denote a valid range, then the returned type should also model a range (`count_if`* versus `find_if_n`).

\* Why does `count_if` return only a `DifferenceType` and drop the information about the end iterator, which needs to be computed anyway?
> But there is no `find_n` algorithm, and with the existence of `view::counted` and range-based algorithm overloads, there isn't likely to be. Instead, `find` with a counted view would return a counted iterator, which is both position and count. That's the model of STL2.
In the original STL design, users were able to define their own STL-like algorithms. Writing an implementation of an algorithm that works efficiently for both bounded and counted ranges is not easy, as explained in one of your blog posts (at least that was my impression). So I think we should support explicit counted versions (`_n`) of the algorithms when users are not able to create an efficient generic version. By this support I mean returning tuples that include the additional semantics, if we provide such functionality for the standard algorithms.
> My point is not about composability, but about having a return type that conveys as much semantic information as possible. If the pair of iterators happens to model a range, the return type should model the range concept (`mismatch` vs `rotate`). If the iterator and count denote a valid range, then the returned type should also model a range (`count_if`* versus `find_if`).
>
> \* Why does `count_if` return only a `DifferenceType` and drop the information about the end iterator, which needs to be computed anyway?
Because an algorithm does only what it is designed to do and no more. We should not try to alter the design of existing algorithms. Also, in this case, returning an end iterator from `count_if` would make the algorithm require forward iterators.

Andrew
> Also, in this case, returning an end iterator from `count_if` would make the algorithm require forward iterators.
This is probably the wrong place to ask such a question, but why would returning an iterator from `count` change the requirement to `ForwardIterator`? The `for_each` algorithm accepts `InputIterator`s and returns the end iterator. `count`/`count_if` may be implemented as an invocation of `for_each` with a functor that conditionally increases an internal counter. As the functor is returned from `for_each`, I can extract the counter value after the iteration.
> Because an algorithm does only what it is designed to do and no more. We should not try to alter the design of existing algorithms.
I think that this applies equally to `for_each` and `count_if`, but only one of them was changed.
You can find the rationale in my first ranges paper. Basically, if the whole point of the algorithm is to compute one value, I opted not to change the return type. My thinking is that it would be too inconvenient and make migration difficult. Nobody uses the return value of `for_each`, but everybody necessarily uses the return value of `count_if`.
I think it would be better to stick, in every case, to the Law of Useful Return as defined in *From Mathematics to Generic Programming*:

> If you've already done the work to get some useful result, don't throw it away. Return it to the caller. A procedure should return all the potentially useful information it computed.

This would avoid making arbitrary decisions, like deciding that changing `copy` will be less breaking than changing `count`.
Would it be possible to move these comments to a separate issue? We are drifting a bit off topic from the original issue.
The traditional hack to recover the end iterator from an STL algorithm is to pass `reference_wrapper<I>` instead of the actual iterator(s). It would be nice if STL2 would officially endorse this technique for cases like this. `reference_wrapper<I>` falls afoul of the exact type constraints for `WeaklyIncrementable` and `Incrementable`, so it's not as simple as writing specializations for the associated type templates. It would not be challenging to write an `iterator_reference_wrapper` or add constrained increment operators to `reference_wrapper` so that it could serve this role.
> It would not be challenging to write an `iterator_reference_wrapper` or add constrained increment operators to `reference_wrapper` so that it could serve this role.
I think that use of an `iterator_reference_wrapper` with pointer copy semantics will break all algorithms that assume that copies of the iterators are independent, i.e. that a copy, if not touched, does not change its position.
Are the requirements that copies be independent (i.e. that modification of the copy does not change the source) stated anywhere? From the non-normative note on `Semiregular`:

> The `Semiregular` concept is modeled by types that behave similarly to built-in types like `int`, except that they may not be comparable with `==`.

It looks like independence of copies is required for `Semiregular` (as built-in types provide such semantics), but I am unable to find anything in the wording that explicitly states that requirement.
> Are the requirements that copies be independent (i.e. that modification of the copy does not change the source) stated anywhere?
It's the difference between `WeaklyIncrementable` and `Incrementable` that matters in this case. An `iterator_reference_wrapper` would necessarily be `Input`, I think.
> This would avoid making arbitrary decisions, like deciding that changing `copy` will be less breaking than changing `count`.
You're right, it's arbitrary, and it makes me uncomfortable. But making people do more work to get the result they asked for also makes me uncomfortable. We don't have the luxury of designing STL2 in a bubble ... there is a huge installed base, and ease of migration is a concern.
It's a question that should be put to the committee, certainly.
> My point is not about composability, but about having a return type that conveys as much semantic information as possible.
OK, this is a fair point. I'll think about it. My gut is telling me that `tagged_tuple<counted_range(It, Diff)>` is not the right interface, though.
We don't have copy independence specified anywhere, and we probably should: input/output iterators do not have copy independence; stronger iterators do. I'm sure the algorithms require it, so it needs to be specified at least in `ForwardIterator`. I imagine most of the permuting algorithms probably require copy independence for the `ValueType` as well...

Adding copy independence to `Regular` or `Semiregular` would break input/output iterators, so that would not be a good idea.
> We don't have copy independence specified anywhere, and we probably should: input/output iterators do not have copy independence; stronger iterators do.
As Eric pointed out, `Incrementable` types require that increment be equality-preserving. This cannot be achieved if the iterator does not provide copy independence: if `a` modifies the state of `b`, then `++a` would not be equal to `++b` (the shared state is incremented twice).
Because `count*` iterates over the entire range. Returning an iterator into the middle of the range requires forward iterators.
> Because `count*` iterates over the entire range. Returning an iterator into the middle of the range requires forward iterators.
The `count*` algorithms would necessarily compute the end iterator, just as `for_each` does. And I am asking why it is not returned: after the addition of the `Sentinel` concept, the value of the end iterator may not be known on the caller's side, so it counts as potentially useful information. I do not see how returning an iterator into the middle of the range gets involved.
> Adding copy independence to `Regular` or `Semiregular` would break input/output iterators.
If we are not able to iterate multiple times over the range, why do we even require copy operations from input/output iterators? Eliminating the ability to use a copy of an iterator to store a position and perform a second iteration effectively excludes multiple passes, and we can still move these iterators around using move operations. Maybe `WeaklyIncrementable` types should not be required to be copy constructible and `Semiregular`.

The copy constructor of `std::istream_iterator` behaves similarly to `std::auto_ptr`: after a copy, only one of the objects may be used. Maybe a shift similar to the one from `std::auto_ptr` to `std::unique_ptr` should also be applied here?
> You're right, it's arbitrary, and it makes me uncomfortable. But making people do more work to get the result they asked for also makes me uncomfortable. We don't have the luxury of designing STL2 in a bubble ... there is a huge installed base, and ease of migration is a concern.
There is also another side to the problem. Returning the end iterator from every algorithm that needs to compute it anyway simplifies invoking STL1 functions from code that uses STL2 (in cases where a standard algorithm was invoked beforehand), because it eliminates the need to use a `common_iterator` to pass a range defined by an `Iterator` and a `Sentinel`: the user already has an end iterator returned from the STL2 algorithm.
> > Because `count*` iterates over the entire range. Returning an iterator into the middle of the range requires forward iterators.
>
> The `count*` algorithms would necessarily compute the end iterator, just as `for_each` does. And I am asking why it is not returned: after the addition of the `Sentinel` concept, the value of the end iterator may not be known on the caller's side, so it counts as potentially useful information. I do not see how returning an iterator into the middle of the range gets involved.
Oh right... this is generalized for sentinels. I was thinking of the current versions of count*.
I suppose we should aim for consistency here.
That said, I don't like that count doesn't return just a number, since that seems to be the common case. I wonder if we can overload count for iterator pairs to recover the original case (without getting ambiguities).
> That said, I don't like that `count` doesn't return just a number, since that seems to be the common case. I wonder if we can overload `count` for iterator pairs to recover the original case (without getting ambiguities).
But then the same should be done for `copy` and possibly all the other algorithms that have a changed return type.

I believe that the return type should not be affected by the category of the input; this covers both traversal (`RandomAccess` vs `Bidirectional`) and the type of the end position (`Iterator` versus `Sentinel`). Otherwise, writing more general algorithms (`binary_search`) using basic ones (`advance`) would be harder, because such special cases would need to be detected and addressed.
Maybe an option would be to go with more meaningful return types (`tagged_tuple<count(DT), input(It)>` instead of `pair<DT, It>`) and provide an implicit conversion to the main result (i.e. the number of occurrences for `count`, the output iterator for `copy`). However, I am not sure whether this would cause problems for new STL2 code.
> provide an implicit conversion to the main result (i.e. the number of occurrences for `count`, the output iterator for `copy`)
All my experience as a library developer tells me to avoid this. Implicit conversions are trouble. This subterfuge will not go unnoticed. People will simply assign the result of `count` to an `auto` variable, or pass it to a (constrained?) function template where the type is deduced as a pair or tuple, and all hell breaks loose.
> I wonder if we can overload `count` for iterator pairs to recover the original case (without getting ambiguities).
You mean, return a plain count when `count` is called with iterator/iterator, and a `pair<count, end>` when called with iterator/sentinel? This complicates generic code calling `count` on a range that was passed in. Now I need to do metaprogramming just to use the return value of an algorithm!
I would prefer structures with more appropriate field names to be returned, but not if the cost of this change is the elimination of the ability to use `std::tie` to capture the result.