pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

DEPR/API: combine #53463

Open jbrockmendel opened 1 year ago

jbrockmendel commented 1 year ago
import operator
import pandas as pd

ser = pd.Series([1, 2, None], dtype="int64[pyarrow]")
ser2 = pd.Series([None, 2, 1], dtype="int64[pyarrow]")
df = ser.to_frame()
df2 = ser2.to_frame()

>>> ser.combine(ser2, operator.add, fill_value=-1)
0    <NA>
1       4
2    <NA>
dtype: int64[pyarrow]
>>> df.combine(df2, operator.add, fill_value=-1)
   0
0  0
1  4
2  0

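Series.combine and DataFrame.combine treat fill_value differently, in a way that I find not at all intuitive: the Series result keeps <NA>, while the DataFrame result replaces it with -1 before adding. We should either align the two methods or deprecate them.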
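
For illustration, here is a minimal sketch of where the two results diverge, based on the documented meaning of fill_value in each method. It uses plain float64 data rather than the int64[pyarrow] dtype above (an assumption, so that it runs without pyarrow), and the column name "x" and the variable names are made up for the example.

```python
import operator

import numpy as np
import pandas as pd

# Plain float64 data (an assumption, to avoid needing pyarrow installed).
left = pd.Series([1.0, 2.0, np.nan])
right = pd.Series([np.nan, 2.0, 1.0])

# Series.combine calls func once per label; fill_value only stands in
# for a label that is absent from one of the two indexes, so existing
# missing values are passed to func unchanged.
series_result = left.combine(right, operator.add, fill_value=-1)

# DataFrame.combine replaces missing values with fill_value in each
# column before handing the two columns to func.
frame_result = left.to_frame("x").combine(
    right.to_frame("x"), operator.add, fill_value=-1
)

# With non-overlapping labels, the Series fill_value does take effect.
a = pd.Series([1.0], index=["a"])
b = pd.Series([10.0], index=["b"])
misaligned_result = a.combine(b, operator.add, fill_value=0)

print(series_result)
print(frame_result)
print(misaligned_result)
```

Under these assumptions, series_result stays missing at positions 0 and 2, frame_result has no missing values, and misaligned_result shows the only case in which the Series version uses fill_value at all.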

tpackard1 commented 1 year ago

Hi, I'd like to take a stab at this issue. Is there a need to discuss whether they should be aligned or deprecated? I'm going to go ahead and do a take, but if discussion is needed before a PR is submitted, please let me know.

tpackard1 commented 1 year ago

take

lithomas1 commented 1 year ago

@tpackard1 This needs some discussion on whether the entire method should be deprecated or the behaviors aligned.

I'd recommend looking at some other issues for now.

jreback commented 1 year ago

+1 to deprecate / remove.