ryanhaining / cppitertools

Implementation of python itertools and builtin iteration functions for C++17
https://twitter.com/cppitertools
BSD 2-Clause "Simplified" License

group_by a tuple key #22

Closed: candychiu closed this issue 8 years ago

candychiu commented 8 years ago

Hi, I am trying to group_by a tuple key:

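#include <iostream>
#include <tuple>
#include <vector>

#include "groupby.hpp"  // cppitertools header; adjust the path to match your setup

struct A {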
    int key1;
    int key2;
    int x;
    int y;
    int value;
};

void func() {
    using namespace std;

    // inputs
    std::vector<int> uniquex{ 1, 2 };
    std::vector<int> uniquey{ 1, 2 };
    vector<A> vec{
        { 10, 9, 1, 1, 5 },     { 10, 9, 1, 2, 6 },     
        { 10, 10, 1, 1, 1 },    { 10, 10, 1, 2, 2 },
        { 10, 9, 2, 1, 7 },     { 10, 9, 2, 2, 8 },
        { 10, 10, 2, 1, 3 },    { 10, 10, 2, 2, 4 },
    };
    auto keyFunc = [](const A& a) { return make_tuple(a.key1, a.key2); };

    // gb.first is the tuple key, gb.second is the group of elements sharing it
    for (auto&& gb : iter::groupby(vec, keyFunc)) {
        std::cout << "key(" << std::get<0>(gb.first) << "," << std::get<1>(gb.first) << "): ";
        for (auto&& s : gb.second) {
            std::cout << s.value << "  ";
        }
        std::cout << '\n';
    }
}

Instead of printing two lines, this example prints four. Is groupby intended to group elements that share a key but are not next to each other?

ryanhaining commented 8 years ago

Note: Just like Python's itertools.groupby, this doesn't do any sorting. It just iterates through the sequence, starting a new group each time the key changes. Thus, if the sequence is not sorted by key, the same key may appear in more than one group.
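To make the "new group on each key change" behaviour concrete, here is a minimal standard-library sketch of the same idea. It is not the cppitertools implementation: `group_consecutive`, `Key`, and `Group` are hypothetical names introduced only for this illustration, and the groups are collected eagerly into vectors for simplicity.

```cpp
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

struct A {
    int key1;
    int key2;
    int x;
    int y;
    int value;
};

using Key = std::tuple<int, int>;
using Group = std::pair<Key, std::vector<int>>;  // key plus the grouped values

// Hypothetical helper (not part of cppitertools): starts a new group whenever
// the key differs from the previous element's key. No sorting is performed.
std::vector<Group> group_consecutive(const std::vector<A>& items) {
    std::vector<Group> groups;
    for (const A& a : items) {
        Key k = std::make_tuple(a.key1, a.key2);
        if (groups.empty() || groups.back().first != k) {
            groups.emplace_back(k, std::vector<int>{});
        }
        groups.back().second.push_back(a.value);
    }
    return groups;
}
```

Applied to the vector from the question, the key sequence is (10,9), (10,9), (10,10), (10,10), (10,9), (10,9), (10,10), (10,10), so the key changes three times and four groups come out, which matches the four printed lines.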

This isn't unique to cppitertools; it is how groupby works everywhere I've seen it. What you are asking for would require sorting the sequence beforehand. That means the keys would need to be less-than comparable, and the sequence would need to provide at least a ForwardIterator. If you want a sorted view of the sequence (without sorting the sequence itself) and then group on that, look at iter::sorted.
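As a rough illustration of the sort-then-group approach, the sketch below uses only the standard library rather than iter::sorted, so it makes no claims about that function's interface. It builds a sorted view as a vector of pointers, which leaves the original sequence untouched, and then groups adjacent equal keys. The names `key_of` and `group_after_sorting` are hypothetical and exist only for this example. std::tuple already provides operator<, which covers the less-than requirement on the key.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

struct A {
    int key1;
    int key2;
    int x;
    int y;
    int value;
};

using Key = std::tuple<int, int>;
using Group = std::pair<Key, std::vector<int>>;  // key plus the grouped values

// Hypothetical key function, equivalent to keyFunc in the question.
Key key_of(const A& a) { return std::make_tuple(a.key1, a.key2); }

// Hypothetical helper (not part of cppitertools): sorts a view of the input
// by key, then groups adjacent elements that share a key.
std::vector<Group> group_after_sorting(const std::vector<A>& items) {
    // Pointers into the input form the "view"; the input itself is not modified.
    std::vector<const A*> view;
    view.reserve(items.size());
    for (const A& a : items) {
        view.push_back(&a);
    }

    // stable_sort keeps the original relative order of elements with equal keys.
    std::stable_sort(view.begin(), view.end(),
                     [](const A* lhs, const A* rhs) {
                         return key_of(*lhs) < key_of(*rhs);
                     });

    std::vector<Group> groups;
    for (const A* a : view) {
        Key k = key_of(*a);
        if (groups.empty() || groups.back().first != k) {
            groups.emplace_back(k, std::vector<int>{});
        }
        groups.back().second.push_back(a->value);
    }
    return groups;
}
```

With the data from the question this yields two groups: key (10,9) with values 5, 6, 7, 8 and key (10,10) with values 1, 2, 3, 4. The trade-off is an O(n log n) sort plus extra storage for the view, which is why a groupby that only compares neighbours does not do this on its own.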