jwood000 / RcppAlgos

Tool for Solving Problems in Combinatorics and Computational Mathematics
GNU General Public License v2.0
44 stars 5 forks source link

ComboGroups for various group size seems missing some combination #53

Open tommylei501 opened 2 months ago

tommylei501 commented 2 months ago

Using ComboGroups for combination of 4 letter, group into 3 group with size (1, 1, 2).

The out put generate the results below.

comboGroups(letters[1:4],grpSizes = c(1, 1, 2)) Grp1 Grp2 Grp3 Grp3 [1,] "a" "b" "c" "d" [2,] "a" "c" "b" "d" [3,] "a" "d" "b" "c"

However, three more possible combination are missing. They are not the same combination with the above. Grp1 Grp2 Grp3 Grp3 [4,] "b" "c" "a" "d" [5,] "b" "d" "a" "b" [6,] "c" "d" "a" "c"

jwood000 commented 2 months ago

@tommylei501,

Thanks for the report.

If you'd like to know what was going on, here is a quick synapsis.

First off, groups of size 1 are tricky to deal with. When I originally developed this, to get around the complexities of dealing with groups of size 1, I noticed that n groups of size 1 are isomorphic to one group of size n. This is great, because I could simply group the groups of size 1 together and treat them as their own group of size n. The problem with this is exactly the example you posted... when we have an actual group of size n along with n groups of size 1. The core code is now seeing two groups of size n.

Fortunately, the fix was somewhat simple. We had to acknowledge that when we have groups of size 1 that they are not the "same" as any other groups. This comes into play with the helper class:

In GroupHelperClass.cpp we have the crucial method balance:

void GroupHelper::balance(
    std::vector<int> &z, int idx1, int curr_bnd, int I
) const {

    situate(z, idx1, curr_bnd + grp[i]);

    if (same[i] && z[lbound[i]] > z[lbound[i + 1L]]) {
.
.
.

Note the check for same[I]. Essentially, we enter the meat of this code if we are on a group that has a non-unique size.

And in the ComboGroupsTemplate.cpp file where same is created, we originally had:

    std::vector<bool> same(numGroups, false);

    for (int i = numGroups - 2; i >= 0; --i) {
        same[i] = vGrpSize[i] == vGrpSize[i + 1L];
    }

Also note that we have a boolean variable: OneGrp which indicates if we have groups of size 1.

When OneGrp = true, we know that it is actually distinct from any other group, so now we change the above code for populating the same vector to:

    std::vector<bool> same(numGroups, false);

    for (int i = numGroups - 2; i >= OneGrp; --i) {
        same[i] = vGrpSize[i] == vGrpSize[i + 1L];
    }

The above fix only helped our case when we are iterating... that is, getting results one by one. The algorithm for obtaining the nth result is a bit different. Nothing changed with the core algorithm, we just had to ensure that we were passing the original vector of sizes.

E.g. In the original code when we had grpSizes = c(1, 1, 2) we would pass the vector {2, 2} (grouping the groups of size 1), now we pass {1, 1, 2}.

This fix will be on CRAN shortly. I close this issue once it is merged into main.

Joseph

tommylei501 commented 2 months ago

@tommylei501,

Thanks for the report.

If you'd like to know what was going on, here is a quick synapsis.

First off, groups of size 1 are tricky to deal with. When I originally developed this, to get around the complexities of dealing with groups of size 1, I noticed that n groups of size 1 are isomorphic to one group of size n. This is great, because I could simply group the groups of size 1 together and treat them as their own group of size n. The problem with this is exactly the example you posted... when we have an actual group of size n along with n groups of size 1. The core code is now seeing two groups of size n.

Fortunately, the fix was somewhat simple. We had to acknowledge that when we have groups of size 1 that they are not the "same" as any other groups. This comes into play with the helper class:

In GroupHelperClass.cpp we have the crucial method balance:

void GroupHelper::balance(
    std::vector<int> &z, int idx1, int curr_bnd, int I
) const {

    situate(z, idx1, curr_bnd + grp[i]);

    if (same[i] && z[lbound[i]] > z[lbound[i + 1L]]) {
.
.
.

Note the check for same[I]. Essentially, we enter the meat of this code if we are on a group that has a non-unique size.

And in the ComboGroupsTemplate.cpp file where same is created, we originally had:

    std::vector<bool> same(numGroups, false);

    for (int i = numGroups - 2; i >= 0; --i) {
        same[i] = vGrpSize[i] == vGrpSize[i + 1L];
    }

Also note that we have a boolean variable: OneGrp which indicates if we have groups of size 1.

When OneGrp = true, we know that it is actually distinct from any other group, so now we change the above code for populating the same vector to:

    std::vector<bool> same(numGroups, false);

    for (int i = numGroups - 2; i >= OneGrp; --i) {
        same[i] = vGrpSize[i] == vGrpSize[i + 1L];
    }

The above fix only helped our case when we are iterating... that is, getting results one by one. The algorithm for obtaining the nth result is a bit different. Nothing changed with the core algorithm, we just had to ensure that we were passing the original vector of sizes.

E.g. In the original code when we had grpSizes = c(1, 1, 2) we would pass the vector {2, 2} (grouping the groups of size 1), now we pass {1, 1, 2}.

This fix will be on CRAN shortly. I close this issue once it is merged into main.

Joseph

Thanks for getting back and to fix this. This package is great, it helped me to solve a lot combination problem. Thanks for your efforts to develop this.