fslaborg / Deedle

Easy to use .NET library for data and time series manipulation and for scientific programming
http://fslab.org/Deedle/
BSD 2-Clause "Simplified" License

Arithmetic operator on frames shows unexpected result #432

Closed by zyzhu 5 years ago

zyzhu commented 5 years ago

Example:

open Deedle

let df1 = frame [ "a" => Series.ofValues [ 1; 2]
                  "b" => Series.ofValues [ 3; 4] ]
let df2 = frame [ "b" => Series.ofValues [ 5; 6]
                  "c" => Series.ofValues [ 7; 8] ]
df1 + df2

I expect the result to be

     a         b  c         
0 -> <missing> 8  <missing> 
1 -> <missing> 10 <missing> 

But the current result is

     a b  c 
0 -> 1 8  7 
1 -> 2 10 8 

This is because the implementation of Frame.zip leaves columns that appear in only one of the frames unmodified. https://github.com/fslaborg/Deedle/blob/master/src/Deedle/Frame.fs#L237

This is not intuitive: the operation should be applied only where both frames have a value in that column. If only one of them has a value, the output should be left as missing, showing that the operation could not be completed. That is the result pandas gives.

import pandas as pd

df1 = pd.DataFrame([[1, 3], [2, 4]],
                    columns=['a', 'b'])
df2 = pd.DataFrame([[5, 7], [6, 8]],
                    columns=['b', 'c'])
df1 + df2

    a   b   c
0 NaN   8 NaN
1 NaN  10 NaN
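
For comparison, here is a minimal sketch of how pandas exposes both behaviors. It assumes the pass-through result Deedle currently gives corresponds roughly to `DataFrame.add` with `fill_value=0`; that mapping is my reading of the output above, not something taken from the Deedle source. The names `strict` and `lenient` are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 3], [2, 4]], columns=["a", "b"])
df2 = pd.DataFrame([[5, 7], [6, 8]], columns=["b", "c"])

# Default alignment: a column present in only one frame becomes NaN.
strict = df1 + df2

# fill_value=0 substitutes 0 where exactly one side is missing, so
# "a" and "c" pass through unchanged (assumed analogue of the
# Deedle output reported in this issue).
lenient = df1.add(df2, fill_value=0)

print(strict)
print(lenient)
```

In pandas the pass-through behavior has to be requested explicitly, and the plain `+` operator gives missing values, which is the default I would expect here as well.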

zyzhu commented 5 years ago

Fixed in #434