ericniebler / range-v3

Range library for C++14/17/20, basis for C++20's std::ranges
Other
4.11k stars 442 forks source link

view::transform's multi-pass guarantee broken using function objects with mutable state #273

Open tivek opened 8 years ago

tivek commented 8 years ago

view::transform returns a view adaptor which stores possibly mutable state (the function object that does the transformation) inside the view object itself instead of the adaptor object. As a consequence, its multi-pass guarantee can be broken:

#include <iostream>
#include <vector>
#include <cassert>
#include <range/v3/all.hpp>

int main()
{
    using namespace ranges;
    std::vector<int> v{0,1,2};

    auto transformed = view::transform(v, [count = 0u](auto& n) mutable {
        return count++;
    }); // this particular view is evidently single-pass
    // however, the following two asserts are triggered

    CONCEPT_ASSERT(SinglePass<range_iterator_t<decltype(transformed)>>()); // :-(

    auto b1 = begin(transformed);
    auto b2 = b1;
    assert(*b1 == *b2); // :-(
}

Of course, we do not want to "fix" the underlying iter_transform_view by having a possibly large copy of the function object inside each resulting iterator. A more suitable solution would be to set iter_transform_view::single_pass to true_type for all non-const function objects.

In the end, a whole class of related bugs can be coaxed out of certain other view_adaptors. As far as I can tell, all of the following store possibly mutable state not within the adaptor but inside the view: adjacent_remove_if_view, remove_if_view, partial_sum_view, and iter_take_while_view could show surprising behavior with multi-pass algorithms.

pfultz2 commented 8 years ago

The multi-pass guarantee can be circumvented by relying on global state as well. A good solution would be to have a view adaptor that can downgrade the range to single pass. Then as a smoke screen the view::transform will only call the const overload of the function object. So the above would have to be written like this:

auto transformed = v 
    | view::transform(std::ref([count = 0u](auto& n) mutable { return count++; })) 
    | view::single_pass;
CaseyCarter commented 8 years ago

transformed is not even a valid InputRange in these examples. assert(*b1 == *b1) fires, contradicting the equality-preservation requirement for * imposed by Readable. We seem to be missing a requirement that the transformation function must be regular.

tivek commented 8 years ago

To be honest I find a lot of use in being able to pass stateful function objects to eg. transform or remove_if. For instance, I am using them to elegantly make lazy versions of set_intersect and set_difference. Wouldn't it be more interesting long-term to make those views smarter and set their traits accordingly instead of restricting the transformation functions and predicates?

CaseyCarter commented 8 years ago

"is a mutable function object" is not in general the inverse of "computes a regular function" - they are orthogonal attributes. We can't assume that mutable functions are not regular, or that immutable functions are regular. The implementation of view::transform in rv3 implicitly assumes that the transformation function is regular, and hence (a) transformed values can be recalculated as necessary without need for memoization, and (b) the resulting view is multi-pass when the underlying range is multi-pass.

If we want a view that does not require a regular transform function - and I agree that there are valid use cases for such a view - it will need to have a different name since regularity of functions isn't a syntactically distinguishable feature. That view will be necessarily single-pass and will need to cache the most recently transformed value much like view::generate does.

CaseyCarter commented 8 years ago

Since the transform algorithm doesn't require a regular function, I'm inclined to think that view::transform should be the name of the single-pass view, and the multi-pass view that requires a regular transformation should be changed.

it will need to have a different name

...or some other syntactic indication, I suppose, e.g. view::transform(multipass_tag, ...).

pfultz2 commented 8 years ago

We can't assume that mutable functions are not regular, or that immutable functions are regular.

Yes, but since function objects are passed by value and there is no specification how many times this object is coped between function calls, its best to always wrap mutable functions in std::ref. Restricting the calls to only const overloads helps the user avoid these problems.

That view will be necessarily single-pass and will need to cache the most recently transformed value much like view::generate does.

Sounds like the memoized view from Pstade.Oven.

Since the transform algorithm doesn't require a regular function, I'm inclined to think that view::transform should be the name of the single-pass view, and the multi-pass view that requires a regular transformation should be changed.

I don't see why we need two views of view::transform. The view::transform can continue to work with multi-pass ranges, and then when the function has side effects a view::memoize could be used.(eg view::transform(f_with_side_effects) | view::memoize) Perhaps, a tag could be passed as well to indicate whether it could be multi-pass or single pass(eg view::memoize(singlepass_tag) or view::memoize(multipass_tag)). This would help fulfil the requirements of Readable for an iterator.

ericniebler commented 8 years ago

The memoize suggestion sounds like the right approach to me.

CaseyCarter commented 8 years ago

If f is not regular, then view::transform(f) returns an object whose member begin returns a type I that meets-the-syntactic-requirements-of-but-does-not-satisfy InputIterator despite having an iterator category that derives from input_iterator_tag. Consequently I does not satisfy Iterator and view::transform(f) does not satisfy Range. It seems hackish to me to have view::foo sometimes return a non-Range that is composable into a valid Range. How would we specify the requirements of view::memoize to enable composition with this particular non-View?

ericniebler commented 8 years ago

Grr, you're right. :-(

ericniebler commented 8 years ago

Since the transform algorithm doesn't require a regular function, I'm inclined to think that view::transform should be the name of the single-pass view, and the multi-pass view that requires a regular transformation should be changed.

My gut tells me otherwise. Using a mutable function objection with view::transform seems to be working against the pure functional paradigm of views. I would rather see view::transform(fun) assume fun is pure, and have something like view::transform(stateful(fun)) do something different.

ericniebler commented 8 years ago

"is a mutable function object" is not in general the inverse of "computes a regular function" - they are orthogonal attributes. We can't assume that mutable functions are not regular, or that immutable functions are regular.

Although this is absolutely true, the transform view must decide how to call the function. It's possible that a pathological function object overloads on const and gives different results depending. How it uses const on the invocation of the function object matters and must be documented. And along that vein, it seems reasonable to me that a function (object) that cannot be called when it's const can be assumed to be non-regular, so long as we give users a way to override that.

pfultz2 commented 8 years ago

And along that vein, it seems reasonable to me that a function (object) that cannot be called when it's const can be assumed to be non-regular, so long as we give users a way to override that.

Exactly. In fact mutable overloads can be called from a const overload by using std::ref or a mutable member variable(this is how fit::mutable_ works). Having an extra step the user needs to go through help ensure that the user really knows what they are doing.

I would rather see view::transform(fun) assume fun is pure, and have something like view::transform(stateful(fun)) do something different.

Having a way to annotate a function as stateful seems very useful.

ericniebler commented 8 years ago

Having a way to annotate a function as stateful seems very useful.

This leads me to believe that there should be an is_regular_function trait, and that the RegularFunction concept should use it to syntactically differentiate itself from Function. @CaseyCarter?

CaseyCarter commented 8 years ago

This leads me to believe that there should be an is_regular_functionis_regular_callable trait, and that the RegularFunctionRegularCallable concept should use it to syntactically differentiate itself from FunctionCallable.

My initial reaction is that using type traits with lambdas would not provide an enjoyable user experience. Using type traits with function pointers seems unfeasible and/or a recipe for ODR violations: what is is_regular_callable<int(*)(float, char)>()? I think we should stick with differentiation at the callsite.

ericniebler commented 8 years ago

I'm thinking about how differentiation at the call site is implemented. stateful(foo) returns an object. How does the system know that it is non regular? I'm suggesting a trait. What is your suggestion? On Jan 20, 2016 10:53 AM, "Casey Carter" notifications@github.com wrote:

This leads me to believe that there should be an is_regular_function is_regular_callable trait, and that the RegularFunctionRegularCallable concept should use it to syntactically differentiate itself from Function Callable.

My initial reaction is that using type traits with lambdas would not provide an enjoyable user experience. Using type traits with function pointers seems unfeasible: what is is_regular_callable<int(*)(float, char)>()? I think we should stick with differentiation at the callsite.

— Reply to this email directly or view it on GitHub https://github.com/ericniebler/range-v3/issues/273#issuecomment-173323346 .

CaseyCarter commented 8 years ago

Oh, that's a different question. disable_regular_callable<T> (or better disable_regular_callable<T, Args...>) is a very different trait from is_regular_callable<T>. Yes, that seems perfectly sensible.