fenbf / cppstories-discussions

4 stars 1 forks source link

2021/filter-cpp-containers/ #2

Open utterances-bot opened 3 years ago

utterances-bot commented 3 years ago

12 Different Ways to Filter Containers in Modern C++ - C++ Stories

Do you know how many ways we can implement a filter function in C++? While the problem is relatively easy to understand - take a container, copy elements that match a predicate and the return a new container - it’s good to exercise with the Standard Library and check a few ideas.

https://www.cppstories.com/2021/filter-cpp-containers/

ogrodnikm commented 3 years ago

With ranges v3 views:

  1. filter is cheaper than growing vector and moving elements.

    template <typename T, typename Pred>
    auto FilterWithRanges(const std::vector<T>& vec, Pred p) {
    return vec | ranges::views::filter(p) | ranges::to_vector;
    }
    • invokes filter multiple times on the same object, but in turn, vector is allocated with required size - no growing one by one.
  2. filter costs a lot.

    template <typename T, typename Pred>
    auto FilterWithRanges(const std::vector<T>& vec, Pred p) {
    return vec | ranges::views::filter(p) | ranges::views::cache1 | ranges::to_vector;
    }
    • similar to copy_if, unfortunately adds additional move on m-1 cached elements.

This has advantage of not having to initialize, then copy, then return, so the whole function is kind of redundant. I'd rather see

auto filtered = vec 
    | ranges::views::filter([](auto& elem) { return !elem.starts_with('*'); }) 
    | ranges::to_vector;

than

auto filtered = FilterWithRanges(vec, [](auto& elem) { return !elem.starts_with('*'); });

and wonder what type is filtered and what crazy magic happens inside FilterWithRanges ;) to function has been proposed to C++23, but it'll take some time to get there. Meanwhile this code works fine with C++14. If you're using conan in your project, adding ranges v3 is one line in conanfile and one line in CMake.

danesh110 commented 3 years ago

To detect if a given container supports push_back, it is possible to use concept:

template <typename T> concept has_push_back = requires(T container, typename T::value_type v) { container.push_back(v); };
fenbf commented 3 years ago

thanks, @danesh110 , yep, good point! see updated article :)

StephenJD commented 3 years ago

What is the cleanest way of doing a repeat-search using ranges? This is my offering:

#include <iostream>
#include <vector>
#include <algorithm>
#include <ranges>

using namespace std;
namespace ranges = std::ranges; 

int main()
{
cout << endl << "Ranges version...\n";
auto haystack = vector<int>{ 3,5,2,1,2,11,2,3,1,2,1,2,4,6 };
auto needle = vector<int>{ 2,1,2 };

auto restOfHaystack = ranges::drop_view(haystack, 0);
auto needle_range = ranges::search(restOfHaystack, needle);
while (needle_range.begin() < haystack.end()) {
    auto posInHaystack = distance(haystack.begin(), needle_range.begin());
    cout << "Ranges::Access at: " << posInHaystack << endl;
    restOfHaystack = ranges::drop_view(haystack, posInHaystack+1);
    cout << "Ranges::continue at: " << (*restOfHaystack.begin()) << endl;
    needle_range = ranges::search(restOfHaystack, needle);
}
}
ogrodnikm commented 3 years ago

@StephenJD I don't have a good idea, how to do this with just std ranges. Also, beauty is in the eye of the beholder. With full ranges library I would do this (I'm assuming task is to find positions of needle in haystack):

#include <range/v3/all.hpp>
#include <vector>
#include <iostream>

using namespace ranges::views;

int main()
{
    auto haystack = std::vector<int>{ 3,5,2,1,2,11,2,3,1,2,1,2,4,6 };
    auto needle = std::vector<int>{ 2,1,2 };
    auto access = haystack 
        | sliding(needle.size()) 
        | enumerate
        | filter([&needle](auto const& position_needle) { return ranges::equal(position_needle.second, needle); })
        | transform([](auto const& position_needle) { return position_needle.first; } );
    for (auto a : access)
        std::cout << "Ranges::Access at: " << a << '\n';
}

This shows the problem I have with C++20 range views. They are so incomplete, it is hard to use them in anything more complex than filter/transform or join/split. Also, they are not easily extendable. Ranges v3 provides utilities that makes adding your own views easy. For example, this is my implementation of a view that takes a range of std::optional<T> objects, filters out ones that are null and returns range of T:

inline constexpr auto unpack_optional_fn = [](auto&& range) {
    return range | rv::filter([](const auto& opt) { return opt.has_value(); }) |
           rv::transform([](auto&& opt) { return std::move(*opt); });
};
inline constexpr auto unpack_optional = rg::make_pipeable(unpack_optional_fn)

...not a rocket science ;)

StephenJD commented 3 years ago

Using some of your ideas I have done it using std::ranges and std::views. I made a slide-lambda so I could use ranges::equal on views.

#include <iostream>
#include <vector>
#include <algorithm>
#include <ranges>

using namespace std;
namespace ranges = std::ranges; 

auto findNeedlesInHaystack(auto & haystack, auto & needle){
    const int haystackSize = haystack.size();
    const int needleSize = needle.size();
    auto haySlide = [&haystack, needleSize](int start){
        return haystack 
            | views::drop(start)
            | views::take(needleSize);
    };

    return views::iota(0,haystackSize-needleSize+1)
        | views::filter([haySlide, &needle](auto i) {
                return ranges::equal(haySlide(i), needle);}
         );
}

int main()
{
    cout << endl << "Ranges version...\n";
    auto haystack = vector<int>{ 3,5,2,1,2,11,2,3,1,2,1,2,4,6,2,1,2 };
    auto needle = vector<int>{ 2,1,2 };

    cout << "Needles found at: ";
    for (auto n : findNeedlesInHaystack(haystack, needle)) {
        cout << n << ", ";
    }
}