Open spawn-guy opened 2 months ago
df.update doesn't support CoW
Thanks for the report - can you provide a reproducible example of how CoW is not supported?
@rhshadrach here is some code and log
# select best source: heading
# HeadingTrue > HeadingMagnetic > HeadingAndDeclination (this is also magnetic) > TrackMadeGood
measurements_df["heading"] = measurements_df["gps_course_over_ground"]
# replace if other value is not nan
measurements_df["heading"].update(measurements_df["gps_heading"])
FutureWarning
_task.py:427: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
measurements_df["heading"].update(measurements_df["gps_heading"])
Inconsistency with the warning: it suggests
df[col] = df[col].update(value)
but that only works for a method that actually "returns" something.
Thanks @spawn-guy, however your example is not reproducible because you did not provide measurements_df. Can you provide a reproducible example?
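For readers following the heading example, the fallback between the two columns can be written without calling an in-place method on df["heading"] at all. This is only a sketch: the data below is a hypothetical stand-in, because the real measurements_df was not shared, and Series.fillna is swapped in for Series.update.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the real measurements_df was not provided.
measurements_df = pd.DataFrame(
    {
        "gps_course_over_ground": [10.0, 20.0, 30.0],
        "gps_heading": [np.nan, 25.0, np.nan],
    }
)

# Prefer gps_heading; fall back to gps_course_over_ground where it is NaN.
# fillna returns a new Series, so the result is assigned as a whole column.
measurements_df["heading"] = measurements_df["gps_heading"].fillna(
    measurements_df["gps_course_over_ground"]
)
print(measurements_df["heading"].tolist())  # [10.0, 25.0, 30.0]
```

Because the column is assigned in one step, there is no intermediate object being modified, which is the pattern the warning is about.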
@rhshadrach it took me some time to pick this up, but here is a small test. At first I thought it might be related to the mask that I use, but the FutureWarning is thrown without it as well.
import numpy as np
import pandas as pd
# test pandas warnings
df = pd.DataFrame(
    {
        "A": [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
        "B": [1, 1, 1, 1, 1, 1],
        "C": [np.nan, 5, 6, np.nan, np.nan, np.nan],
        "D": [0, 0, 2, 2, 0, 0],
    }
)
# with mask
# df = df[df["D"] > 0]
df["E"] = df["A"]
df["E"].update(df["B"])
# df["E"].update(df["C"])
print(df)
results in
cli_python_311_upgrade_test.py:209: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
df["E"].update(df["B"])
     A  B    C  D    E
0  NaN  1  NaN  0  1.0
1  NaN  1  5.0  0  1.0
2  NaN  1  6.0  2  1.0
3  NaN  1  NaN  2  1.0
4  NaN  1  NaN  0  1.0
5  NaN  1  NaN  0  1.0
The FutureWarning is thrown after df["E"].update(df["B"]), so in the current implementation I don't see a way to fix this FutureWarning, for the reasons mentioned above.
And if I do as the warning suggests, it will be a mistake:
import numpy as np
import pandas as pd
# test pandas warnings
df = pd.DataFrame(
    {
        "A": [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
        "B": [1, 1, 1, 1, 1, 1],
        "C": [np.nan, 5, 6, np.nan, np.nan, np.nan],
        "D": [0, 0, 2, 2, 0, 0],
    }
)
# with mask
# df = df[df["D"] > 0]
df["E"] = df["A"]
df["E"].update(df["B"])
# df["E"].update(df["C"])
print(df)
df["E"] = df["A"]
df["E"] = df["E"].update(df["C"])
print(df)
output
cli_python_311_upgrade_test.py:209: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
df["E"].update(df["B"])
     A  B    C  D    E
0  NaN  1  NaN  0  1.0
1  NaN  1  5.0  0  1.0
2  NaN  1  6.0  2  1.0
3  NaN  1  NaN  2  1.0
4  NaN  1  NaN  0  1.0
5  NaN  1  NaN  0  1.0
     A  B    C  D     E
0  NaN  1  NaN  0  None
1  NaN  1  5.0  0  None
2  NaN  1  6.0  2  None
3  NaN  1  NaN  2  None
4  NaN  1  NaN  0  None
5  NaN  1  NaN  0  None
cli_python_311_upgrade_test.py:214: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
df["E"] = df["E"].update(df["C"])
notice the all-None column E
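The all-None column is what one would expect from Series.update modifying its object in place and returning None, so assigning its return value overwrites the column. One way to keep update semantics is to call it on a standalone Series and then assign that Series back. The sketch below assumes this should avoid the chained-assignment warning because no temporary df["E"] is involved; the variable names e and result are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "A": [np.nan] * 6,
        "C": [np.nan, 5, 6, np.nan, np.nan, np.nan],
    }
)

# Work on a standalone copy instead of the temporary object df["E"].
e = df["A"].copy()
result = e.update(df["C"])  # modifies e in place
print(result)  # None

# Assign the updated Series back as a whole column.
df["E"] = e
print(df["E"].tolist())
```

Only the non-NaN values of C (rows 1 and 2) are written into E; the other rows stay NaN.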
Pandas version checks
Checked against the documentation on main.
Location of the documentation
https://pandas.pydata.org/docs/dev/reference/api/pandas.Series.update.html#pandas.Series.update
Documentation problem
df.update resembles how Python's dict.update works, but df.update doesn't support CoW.
Suggested fix for documentation
Remove the FutureWarning for df.update, or create a (for example) df.coalesce method that will actually return something. This shouldn't break existing code.
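To make the df.coalesce idea concrete, here is a rough sketch of what a returning variant could look like as a free function. The name coalesce and its signature are hypothetical and not part of pandas; it is built on Series.fillna, which already returns a new Series.

```python
from functools import reduce

import numpy as np
import pandas as pd


def coalesce(*columns: pd.Series) -> pd.Series:
    """Hypothetical helper (not a pandas API): first non-null value per row.

    Columns are given in priority order, highest priority first.
    """
    if not columns:
        raise ValueError("coalesce needs at least one Series")
    # Fill the gaps of the running result with the next lower-priority column.
    return reduce(lambda acc, nxt: acc.fillna(nxt), columns)


df = pd.DataFrame(
    {
        "A": [np.nan] * 6,
        "B": [1, 1, 1, 1, 1, 1],
        "C": [np.nan, 5, 6, np.nan, np.nan, np.nan],
    }
)

# C wins where it has a value, then B, then A.
df["E"] = coalesce(df["C"], df["B"], df["A"])
print(df["E"].tolist())  # [1.0, 5.0, 6.0, 1.0, 1.0, 1.0]
```

Since the helper returns a Series, the df[col] = ... pattern from the warning text works with it, unlike with update.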