Open Arshaku opened 2 years ago
This is related to https://github.com/EntilZha/PyFunctional/issues/158, where the root issue is that, for convenience, I originally decided to wrap a little too aggressively. I think the fix would be in two steps: (1) add a configurable option to not wrap elements, with the default being to wrap, and (2) bump the version to 2.X and make the default to not wrap, to avoid breakage. I'd be open to a PR that does this.
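The two-step plan can be sketched with a toy class. This is not PyFunctional's code: `MiniSeq`, its `no_wrap` argument, and the `_wrap` helper are hypothetical names chosen for illustration, and plain lists stand in for the wrapped elements.

```python
from functools import reduce as _reduce


class MiniSeq:
    """Toy stand-in for a sequence wrapper (hypothetical, not PyFunctional)."""

    def __init__(self, items, no_wrap=False):
        self._items = list(items)
        # Step (1): wrapping stays the default; callers can opt out.
        self._no_wrap = no_wrap

    def _wrap(self, value):
        # Wrap list/tuple results so calls can be chained, unless disabled.
        if not self._no_wrap and isinstance(value, (list, tuple)):
            return MiniSeq(value, no_wrap=self._no_wrap)
        return value

    def reduce(self, func):
        return self._wrap(_reduce(func, self._items))

    def to_list(self):
        return list(self._items)


pairs = [[1, 2], [3, 4]]
wrapped = MiniSeq(pairs).reduce(lambda x, y: x + y)
plain = MiniSeq(pairs, no_wrap=True).reduce(lambda x, y: x + y)
print(type(wrapped).__name__)  # MiniSeq
print(type(plain).__name__)    # list
```

Under this sketch, step (2) would amount to flipping the default of `no_wrap` in a 2.x release.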
Similar issue during reduce with the latest master.
Expected:
from functools import reduce
reduce(lambda x,y: x.add(y), [df,df])
Out[1]:
A B
0 24.0 14.0
1 8.0 4.0
In fact:
seq([df,df]).reduce(lambda x,y: x.add(y))
Out[2]:
[array([24., 14.]), array([8., 4.])]
to_pandas helps somewhat, but the column names are lost:
seq([df,df]).reduce(lambda x,y: x.add(y)).to_pandas()
Out[3]:
0 1
0 24.0 14.0
1 8.0 4.0
Hope it can be fixed
Hi, any chance this issue will be fixed soon?
I don't have the bandwidth to contribute fixes myself right now, but I'd welcome and review pull requests that fix it roughly as previously outlined.
@EntilZha Any reason why __getitem__() also wraps?
The reason I originally did it this way is the same: I wanted it to be easy to do something like:
In [1]: from functional import seq
In [2]: seq.range(10).grouped(3)[0].map(lambda x: x * 2)
Out[2]: [0, 2, 4]
In [3]: type(seq.range(10).grouped(3)[0].map(lambda x: x * 2))
Out[3]: functional.pipeline.Sequence
As I mentioned in my prior comments, in retrospect this has three issues: (1) there is no way to configure the behavior, namely to disable it; (2) even if it were configurable, I think it's probably incorrect for wrapping to be the default in most cases, and it should probably happen more sparingly; and (3) changing this is a breaking change, likely requiring a move to 2.x.
I'd welcome/review PRs that would fix this, but don't have the time to do it myself right now. If you are interested, I can outline how I'd do this in a little more detail.
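One way to read "more sparingly" is sketched below with a toy class. It is an illustration only, not the library's implementation: `SparseWrapSeq` is a hypothetical name, and the rule it applies (wrap only plain lists and tuples in `__getitem__`) is an assumption about what a narrower policy could look like.

```python
class SparseWrapSeq:
    """Toy sequence whose __getitem__ wraps only plain lists and tuples."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        item = self._items[index]
        # Exact type check: subclasses and other objects are left alone.
        if type(item) in (list, tuple):
            return SparseWrapSeq(item)
        return item

    def map(self, func):
        return SparseWrapSeq(func(x) for x in self._items)

    def to_list(self):
        return list(self._items)


groups = SparseWrapSeq([[0, 1, 2], [3, 4, 5]])
doubled = groups[0].map(lambda x: x * 2)
print(doubled.to_list())  # [0, 2, 4]

table = {"A": [24.0, 8.0], "B": [14.0, 4.0]}  # stand-in for a DataFrame
print(SparseWrapSeq([table])[0] is table)  # True
```

Chaining on nested lists keeps working, while an object that is not a plain list or tuple comes back unchanged.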
Thanks!
@reklanirs
> In fact:
> seq([df,df]).reduce(lambda x,y: x.add(y))
> Out[2]:
> [array([24., 14.]), array([8., 4.])]
This is a different problem: PyFunctional has special handling for DataFrame and, for some reason, extracts values from it.
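Why the labels vanish once only the values are kept can be illustrated without pandas. The sketch below shows the general mechanism under stated assumptions, not PyFunctional's actual code: a dict of columns stands in for a DataFrame, and `extract_values` and `restore` are hypothetical helpers.

```python
# A column-labelled table, standing in for a DataFrame (no pandas needed).
table = {"A": [24.0, 8.0], "B": [14.0, 4.0]}


def extract_values(tbl):
    """Return row-wise values only; the column labels are dropped."""
    return [list(row) for row in zip(*tbl.values())]


def restore(rows, columns):
    """Rebuild a labelled table from rows plus separately kept labels."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


columns = list(table)  # keep the labels before extracting
rows = extract_values(table)
print(rows)  # [[24.0, 14.0], [8.0, 4.0]]
print(restore(rows, columns) == table)  # True
```

Once the rows are all that remain, the labels can only come back if they were saved separately and passed in again.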
> to_pandas can give some help but the column names will be missing:
> seq([df,df]).reduce(lambda x,y: x.add(y)).to_pandas()
> Out[3]:
> 0 1
> 0 24.0 14.0
> 1 8.0 4.0
How about this:
>>> seq(reduce(lambda x,y: x.add(y), [df,df])).to_pandas(df.columns)
col1 col2
0 2 8
1 4 10
2 6 12
This code prints "<class 'functional.pipeline.Sequence'>", but the expected output is "<class 'pandas.core.frame.DataFrame'>".