Closed haliaga closed 3 years ago
Thank you for the feedback. I will try to replicate this issue when I get a chance.
@haliaga Apologies for the long delay. Just investigating this issue now.
@haliaga
First of all, thank you for submitting this issue. Investigating it uncovered a bug in the group_join method that I had failed to notice earlier, and I have incorporated your code into this project as a unit test.
The issue with your code is in the result_func lambda function passed to the group_join method. Each result of the group_join operation is a tuple of the form (element, Grouping), where element is an element from the outer collection of the group_join and Grouping is an Enumerable that contains an iterator over the inner collection elements that satisfy the inner_key(inner_element) == outer_key(outer_element) predicate. In your case, your code should look like this unit test I recently added:
e1 = Enumerable([{"value": 1}, {"value": 2}, {"value": 3}, {"value": 0}])
e2 = Enumerable([1, 2, 3, 1, 2, 1])
res = e1.group_join(
    e2,
    outer_key=lambda x: x["value"],
    inner_key=lambda y: y,
    result_func=lambda r: (r[0], r[1].to_list()),
)
self.assertListEqual(
    [
        ({"value": 1}, [1, 1, 1]),
        ({"value": 2}, [2, 2]),
        ({"value": 3}, [3]),
    ],
    res.to_list(),
)
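To make the (element, Grouping) shape easier to see without the library, here is a rough pure-Python sketch of the same matching logic. The function group_join_sketch is a hypothetical stand-in written for illustration only; it is not the library's implementation, and it uses plain lists instead of Enumerable and Grouping. Dropping outer elements with no match is an assumption taken from the expected output of the unit test.

```python
def group_join_sketch(outer, inner, outer_key, inner_key, result_func):
    """Hypothetical illustration of group_join semantics (not the library code)."""
    results = []
    for outer_element in outer:
        # Collect the matching inner elements eagerly, so nothing can be
        # exhausted by a later iteration step.
        matches = [i for i in inner if inner_key(i) == outer_key(outer_element)]
        # Assumption taken from the unit test: outer elements with no
        # matching inner elements are left out of the result.
        if matches:
            results.append(result_func((outer_element, matches)))
    return results


pairs = group_join_sketch(
    [{"value": 1}, {"value": 2}, {"value": 3}, {"value": 0}],
    [1, 2, 3, 1, 2, 1],
    outer_key=lambda x: x["value"],
    inner_key=lambda y: y,
    result_func=lambda r: (r[0], list(r[1])),
)

for pair in pairs:
    print(pair)
```

The result_func here receives the same two-item tuple as in the unit test, with the outer element at r[0] and the matching inner elements at r[1].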
As you can see, the result_func in the above code explicitly calls to_list() on the Grouping instance in the tuple passed to the result_func lambda function. This is necessary when using result_func because of how the groupby iterator is implemented in the Python itertools module. To quote:
The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list
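The behaviour described in that quote can be reproduced with itertools alone. The snippet below is a minimal sketch, independent of this project, and the variable names are made up for illustration. It compares reading the group iterators after groupby() has moved on with storing each group as a list right away.

```python
from itertools import groupby

data = [1, 1, 1, 2, 2, 3]

# Keep the raw group iterators and only read them after groupby() has
# been advanced past them. Earlier groups are no longer visible.
deferred = [(key, group) for key, group in groupby(data)]
deferred_result = [(key, list(group)) for key, group in deferred]

# Store each group as a list before groupby() advances to the next key.
eager_result = [(key, list(group)) for key, group in groupby(data)]

print(deferred_result[0])  # the group for key 1 comes back empty
print(eager_result[0])     # the group for key 1 holds all three items
```

This is the same reason the to_list() call inside result_func matters: it stores the group while it is still visible.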
Another example of this is available in the documentation.
The code below prints the correct result when a breakpoint is set on "func":
({'value': 1}, {'key': "{'id': 1}", 'enumerable': '[1, 1, 1]'})
({'value': 2}, {'key': "{'id': 2}", 'enumerable': '[2, 2]'})
({'value': 3}, {'key': "{'id': 3}", 'enumerable': '[3]'})
({'value': 0}, {'key': "{'id': 0}", 'enumerable': '[]'})
If no breakpoint is set, or, better, if r is used directly instead of func(r), the "enumerable" comes back empty.
code:
def func(res):
    return res

def _010_group_join():
    e1 = Enumerable([{'value': 1}, {'value': 2}, {'value': 3}, {'value': 0}])
    e2 = Enumerable([1, 2, 3, 1, 2, 1])
    res = e1.group_join(
        e2,
        outer_key=lambda x: x['value'],
        inner_key=lambda y: y,
        result_func=lambda r: func(r),
    ).to_list()
    for e in res:
        print(e)
    print('end')