k06a / boolinq

Simplest C++ header-only LINQ template library
MIT License
628 stars 79 forks source link

about groupBy #39

Closed nqf closed 4 years ago

nqf commented 4 years ago
std::vector<Man> src = {
        {"Kevin", 14}, {"Kevin", 24}, {"Kevin", 34}, {"Kevin", 44},
        {"Anton", 18},
        {"Terra", 20}, {"Terra", 21},
                {"Layer", 15},
                {"Agata", 17}, 
    };

    auto gr = boolinq::from(src1)
                  .groupBy([](Man a) {
                      return a.name;
                  })
                  .select([](auto p) {
                      return p.first + " => " + std::to_string(p.second.sum([](Man man) {
                                 return man.age;
                             }));
                  }).toStdSet();

    for (auto& r : gr) {
        std::cout << r << std::endl;
    }

output:
 Kevin => 116
nqf commented 4 years ago

I expect it to divide the data into five groups (Kevin,Anton,Terra,Layer,Agata), But it is not well grouped, Is my understanding wrong?

k06a commented 4 years ago

@nqf it seems you found a solution?

nqf commented 4 years ago

I found another issue, https://github.com/k06a/boolinq/issues/32 , But it doesn't seem to work very well, it only output a information(Kevin ), Please see the code above

k06a commented 4 years ago

@nqf I believe issue is with using std::set for struct with undefined operator < and operator ==.

nqf commented 4 years ago
#include "boolinq.h"

#include <unordered_map>
#include <vector>
#include <iostream>

int main() {
    struct Man {
        std::string name;
        int age;

        bool operator==(const Man& rhs) const {
            return (name == rhs.name) && (age == rhs.age);
        }

        bool operator<(const Man& t1) const {
            return (age < t1.age);
        }
    };

    std::vector<Man> src1 = {
        {"Kevin", 14}, {"Kevin", 24}, {"Kevin", 34}, {"Kevin", 44}, {"Anton", 18},
        {"Agata", 17}, {"Terra", 20}, {"Terra", 21}, {"Layer", 15},
    };

    auto gr = boolinq::from(src1)
                  .groupBy([](Man a) {
                      return a.name;
                  })
                  .select([](auto p) {
                      return p.first + " => " + std::to_string(p.second.sum([](Man man) {
                                 return man.age;
                             }));
                  })
                  .toStdVector();

    std::cout << gr.size() << std::endl;
    for (auto& r : gr) {
        std::cout << r << std::endl;
    }

    return 0;
}
➜  boolinq git:(master) ✗ ./a.out                 
1
Kevin => 116

This is a complete example, toStdVector still only outputs Kevin, Seems to be related to the internal order of SRC1 @k06a

ghost commented 4 years ago

Hi, @k06a !

I believe there is an issue with groupBy code:

                    _TKey key = apply(linq.next());
                    if (set.insert(key).second) {
                        return std::make_pair(key, linqCopy.where([apply, key](T v){
                            return apply(v) == key;
                        }));
                    }
                    throw LinqEndException();

Perhaps there is no need to throw an exception here, but an infinite loop could be used instead (it seems to work, but I haven't tested it thoroughly):

                    while(true) {
                    _TKey key = apply(linq.next());
                    if (set.insert(key).second) {
                        return std::make_pair(key, linqCopy.where([apply, key](T v){
                            return apply(v) == key;
                        }));
                    }
                  }

Here is a modified test to reproduce the issue:

int arr[] = {0,1,3,2,4,5,6,7,8,9};

    int ans_0[] = {0,3,6,9};
    int ans_1[] = {1,4,7};
    int ans_2[] = {2,5,8};

    auto dst = from(arr).groupBy([](int a){return a % 3;});

    EXPECT_EQ(0, dst.elementAt(0).first);
    EXPECT_EQ(1, dst.elementAt(1).first);
    EXPECT_EQ(2, dst.elementAt(2).first);

I have swapped 2 and 3 in the input data. Now we have only 2 of 3 groups returned, because values are 0, 1 and 0 again. The exception is thrown because 0 is already in the set. The 3rd group is never visited.

Thanks!

k06a commented 4 years ago

@87alex87 sorry for the late reply, you're absolutely right!

k06a commented 4 years ago

@nqf fix merged in master!