markrogoyski / itertools-php

PHP Iteration Tools Library
MIT License

Methods `Single::groupBy()`, `Stream::groupBy()` improved #35

Closed Smoren closed 1 year ago

Smoren commented 1 year ago

Hi @markrogoyski,

While solving a specific task, I found that Single::groupBy() lacked functionality I needed.

So I've improved the Single::groupBy() and Stream::groupBy() methods with a $groupKeyFunction param and support for items that belong to multiple groups.

Usage example:

$input = [
    ['name' => 'Sam', 'interests' => ['programming', 'books', 'slacking', 'music']],
    ['name' => 'Laura', 'interests' => ['math', 'fantasy', 'wine', 'music']],
    ['name' => 'Alice', 'interests' => ['music', 'programming', 'fantasy']],
    ['name' => 'Anonymous', 'interests' => []],
];

$result = Single::groupBy(
    $input,
    fn (array $profile) => $profile['interests'],
    fn (array $profile) => $profile['name'],
);

/*
[
    'programming' => [
        'Sam' => ['name' => 'Sam', 'interests' => ['programming', 'books', 'slacking', 'music']],
        'Alice' => ['name' => 'Alice', 'interests' => ['music', 'programming', 'fantasy']],
    ],
    'books' => [
        'Sam' => ['name' => 'Sam', 'interests' => ['programming', 'books', 'slacking', 'music']],
    ],
    'slacking' => [
        'Sam' => ['name' => 'Sam', 'interests' => ['programming', 'books', 'slacking', 'music']],
    ],
    'music' => [
        'Sam' => ['name' => 'Sam', 'interests' => ['programming', 'books', 'slacking', 'music']],
        'Laura' => ['name' => 'Laura', 'interests' => ['math', 'fantasy', 'wine', 'music']],
        'Alice' => ['name' => 'Alice', 'interests' => ['music', 'programming', 'fantasy']],
    ],
    'math' => [
        'Laura' => ['name' => 'Laura', 'interests' => ['math', 'fantasy', 'wine', 'music']],
    ],
    'fantasy' => [
        'Laura' => ['name' => 'Laura', 'interests' => ['math', 'fantasy', 'wine', 'music']],
        'Alice' => ['name' => 'Alice', 'interests' => ['music', 'programming', 'fantasy']],
    ],
    'wine' => [
        'Laura' => ['name' => 'Laura', 'interests' => ['math', 'fantasy', 'wine', 'music']],
    ],
]
*/

coveralls commented 1 year ago

Pull Request Test Coverage Report for Build 4171755685

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Totals Coverage Status
Change from base Build 4171111633: 0.002%
Covered Lines: 687
Relevant Lines: 688

markrogoyski commented 1 year ago

Hi @Smoren,

Thank you for the PR to improve the groupBy functionality.

If I understand it correctly, there are two changes here:

  1. Allow the original grouping function to further separate into groups if the result of the grouping function is a list.
  2. A new parameter function to index the values within each group.

Does that sound correct or would you describe it differently? Thanks, Mark
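
To make the two changes above concrete, here is a minimal standalone sketch. It is not the code from this PR: groupBySketch() is a hypothetical helper written only for illustration, it returns a plain array, and its parameter names simply mirror the ones used in this thread.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical illustration of the two changes; not the library's implementation.
 */
function groupBySketch(iterable $data, callable $groupKeyFunction, ?callable $itemKeyFunction = null): array
{
    $groups = [];
    foreach ($data as $item) {
        $groupKeys = $groupKeyFunction($item);
        // Change 1: a list result places the item in several groups;
        // a single (non-iterable) result is treated as exactly one group.
        if (!is_iterable($groupKeys)) {
            $groupKeys = [$groupKeys];
        }
        foreach ($groupKeys as $groupKey) {
            if ($itemKeyFunction === null) {
                // No item key function: items are appended with numeric keys.
                $groups[$groupKey][] = $item;
            } else {
                // Change 2: the item is indexed inside its group by the computed key.
                $groups[$groupKey][$itemKeyFunction($item)] = $item;
            }
        }
    }
    return $groups;
}

$people = [
    ['name' => 'Sam', 'interests' => ['books', 'music']],
    ['name' => 'Laura', 'interests' => ['music']],
];

// List-valued grouping function plus an item key function.
$byInterest = groupBySketch(
    $people,
    fn (array $p) => $p['interests'],
    fn (array $p) => $p['name']
);

// Scalar-valued grouping function, no item key function.
$byInitial = groupBySketch($people, fn (array $p) => $p['name'][0]);
```

Here $byInterest has the groups 'books' and 'music', each indexed by name, while $byInitial has the groups 'S' and 'L' with numerically indexed items.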

Smoren commented 1 year ago

Hi @markrogoyski,

Yes, that's exactly right!

Smoren commented 1 year ago

I've refactored Single::groupBy() a little: the call to Stream::of($group)->toArray() has been replaced by Transform::toArray($group).

Smoren commented 1 year ago

This branch has been rebased on develop.

Smoren commented 1 year ago

Hi @markrogoyski,

What about this PR? I really need this functionality at my job, and I would be glad if it were included in the next release.

markrogoyski commented 1 year ago

Hi @Smoren,

Sorry for the delay in reviewing this.

I think there may be a bug if there are duplicated candidate group items. For example, try this unit test case for testArray:

[
    [
        ['name' => 'Sam', 'interests' => ['programming', 'books', 'slacking', 'music']],
        ['name' => 'Laura', 'interests' => ['math', 'fantasy', 'wine', 'music']],
        ['name' => 'Alice', 'interests' => ['music', 'music', 'programming', 'fantasy']],  // Alice has "music" twice
        ['name' => 'Anonymous', 'interests' => []],
    ],
    fn ($x) => $x['interests'],
    null,
    [ // ???
    ],
],

I don't think it will deduplicate the results, and Alice will end up in the music group twice.

If you add $itemKeyFunction then I think it will overwrite that key and only show up once, but the interest will still show up twice.

You can argue it is perhaps bad data, but the resultant group should not end up with more participants than were originally input. I think you might want to try to dedupe this somehow.
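
The concern can be reproduced with a small standalone sketch. This assumes a naive grouping loop, which may or may not match the PR's code at this point; naiveGroupBy() is a hypothetical helper and the data is shortened from the test case above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical naive grouping with no deduplication; an assumption, not the PR's code.
 */
function naiveGroupBy(iterable $data, callable $groupKeyFunction, ?callable $itemKeyFunction = null): array
{
    $groups = [];
    foreach ($data as $item) {
        // Every candidate group key is used as-is, including repeated ones.
        foreach ($groupKeyFunction($item) as $groupKey) {
            if ($itemKeyFunction === null) {
                $groups[$groupKey][] = $item;
            } else {
                // A repeated group key writes to the same item key again.
                $groups[$groupKey][$itemKeyFunction($item)] = $item;
            }
        }
    }
    return $groups;
}

$input = [
    ['name' => 'Laura', 'interests' => ['math', 'music']],
    ['name' => 'Alice', 'interests' => ['music', 'music', 'programming']], // "music" twice
];

$withoutItemKeys = naiveGroupBy($input, fn (array $p) => $p['interests']);
$withItemKeys = naiveGroupBy(
    $input,
    fn (array $p) => $p['interests'],
    fn (array $p) => $p['name']
);

echo count($withoutItemKeys['music']), "\n"; // 3: Laura, Alice, Alice
echo count($withItemKeys['music']), "\n";    // 2: Alice's entry is overwritten
```

Without an item key function, the 'music' group holds three entries for two people; with one, the second write for Alice lands on the same key, so the group size stays at two.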

Smoren commented 1 year ago

Hi @markrogoyski,

I've implemented deduplication so that an item can no longer appear in the same group more than once, and added new test cases.
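
As a rough illustration of one possible approach (not necessarily how the merged code does it), repeated group keys can be collapsed per item before the item is assigned to its groups. groupByDeduplicated() is a hypothetical name used only for this sketch.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: deduplicate the candidate group keys of each item.
 * This is an assumption about one way to do it, not the library's implementation.
 */
function groupByDeduplicated(iterable $data, callable $groupKeyFunction): array
{
    $groups = [];
    foreach ($data as $item) {
        $groupKeys = $groupKeyFunction($item);
        if (!is_array($groupKeys)) {
            $groupKeys = [$groupKeys];
        }
        // array_unique() keeps the first occurrence of each key,
        // so an item is added to any given group at most once.
        foreach (array_unique($groupKeys) as $groupKey) {
            $groups[$groupKey][] = $item;
        }
    }
    return $groups;
}

$laura = ['name' => 'Laura', 'interests' => ['math', 'fantasy', 'wine', 'music']];
$alice = ['name' => 'Alice', 'interests' => ['music', 'music', 'programming', 'fantasy']];

$groups = groupByDeduplicated([$laura, $alice], fn (array $p) => $p['interests']);

// Names of everyone in the "music" group: Alice is listed once.
$musicNames = array_column($groups['music'], 'name');
```

With the duplicated 'music' entry collapsed, no group ends up with more members than there are input items.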

Smoren commented 1 year ago

Thank you for merging this PR!

markrogoyski commented 1 year ago

Hi @Smoren,

I've released this and the other new functionality in the latest release v1.4.0. Thank you for all the PRs and implementing new features!

Mark

Smoren commented 1 year ago

Hi @markrogoyski, this is great news! Thank you!

Smoren commented 1 year ago

Hi @markrogoyski, I think there is a formatting typo here in the release description.

[Screenshot of the release description]

markrogoyski commented 1 year ago

Thanks. I fixed it.