ericniebler / range-v3

Range library for C++14/17/20, basis for C++20's std::ranges
Other
4.06k stars 437 forks source link

ranges::remove doesn't work with ranges::views::chunk #1760

Closed oschonrock closed 1 year ago

oschonrock commented 1 year ago

I can see that this would be non-trivial, but it seems fundamentally possibly to me?


#include "range/v3/algorithm/remove.hpp"
#include "range/v3/view/chunk.hpp"
#include <vector>

int main() {

    std::vector<int> v{
        1, 2, 3, 4,
        5, 6, 7, 8, 
        9, 10, 11, 12,
        13, 14, 15, 16
    };

    auto chunked = ranges::views::chunk(v, 4);
    auto it = ranges::remove(chunked, 9, [](const auto& e) { return e[0]; });

    // expected result  (xx = unspecified)
    // std::vector<int> v{
    //     1, 2, 3, 4,
    //     5, 6, 7, 8, 
    //     13, 14, 15, 16,
    //     xx, xx, xx, xx
    // };
    // and it pointing at chunked[3] (ie the row of xx)  
}

currently I get concept errors "not permutable".. etc..

brevzin commented 1 year ago

This is actually an interesting question.

What remove does is it permutes the input range - we basically do this (your remove with a projection is basically the same thing as calling remove_if with the unary predicate [](auto const& elem) { return elem[0] == 9; })

template <class ForwardIt, class UnaryPredicate>
ForwardIt remove_if(ForwardIt first, ForwardIt last, UnaryPredicate p)
{
    first = std::find_if(first, last, p);
    if (first != last) {
        for(ForwardIt i = first; ++i != last; ) {
            if (!p(*i)) {
                *first++ = std::move(*i); // <==
            }
        }
    }
    return first;
}

The interesting line here is the one I marked, the one that assigns. Ranges doesn't check if that line, syntactically, is valid - we need more than that because of https://github.com/ericniebler/range-v3/issues/573: basically, there are types for which that line compiles but does actually nothing.

Indeed, the exact example here is one such type. The views::chunk gives you a range of prvalue views (that happen to be calls to views::take), and assigning to those prvalue views would compile but not actually change anything about the underlying state of the range. In that sense, I think it's better than the ranges::remove call fails than for it to have compiled and spent cycles doing absolutely nothing.

But what's particularly interesting is that there are proxy reference types for which this could work. For instance, if instead of chunk(v, 4) we did something like adjacent_chunk<4>(v) (similar to the views::adjacent<N> we now have in C++23). In that case, instead of getting a range of views::take objects that are ranges of int& all of whose size is 4, we'd get a range of tuple<int&, int&, int&, int&>. And that type is actually indirectly writable (rvalue assignment does assign through the tuple which would change the underlying contents) and this would compile with ranges::remove and it would do the right thing.

It's also possible to write a different algorithm (maybe named remove_through or something?) which would do the right thing in this case - which would involve writing through every element of *first (rather than writing to *first).

It's also possible to accomplish the removal by adding a views::filter (or views::remove_if) to achieve the same end.

Either way, ranges::remove not working in this case is correct.

oschonrock commented 1 year ago

Thanks for the explanation. Makes sense.

The actual application requires this slightly more complex case of an additional layer of zip with the removal controlled by the parallel zip range. I have included the actual erase step as well.


#include "range/v3/algorithm/remove.hpp"
#include <iostream>
#include "range/v3/view/chunk.hpp"
#include "range/v3/view/zip.hpp"
#include <vector>

int main() {

    std::vector<int> v{
        1, 2, 3, 4,
        5, 6, 7, 8, 
        9, 10, 11, 12,
        13, 14, 15, 16
    };

    auto chunked = ranges::views::chunk(v, 4);

    std::vector<int> b{10, 20, 30, 40};

    auto zipped = ranges::views::zip(chunked, b);

    auto it = ranges::remove(zipped, 30, [](const auto& e) { return e.second; });

    auto relidx = it - zipped.begin();
    v.erase(v.begin() + relidx * 4, v.end());
    b.erase(b.begin() + relidx, b.end());

}

So you can see that the reason for using chunk, is so that the v can have the same dimensionality as the parallel b.

Could any of the strategies you describe achieve that? The one that looks the most promising is writing a remove_through? (not sure I would quite know how to do that)

Or, falling back to basics and just writing a custom adapter class which owns v but presents is as 2D array with iterators to make it range compatible and then feeding that straight to zip | remove => erase.

I think c++23's mdspan would be close to what that custom class needs to be (without owning).