Pandas-2.0.0 changed the default behavior of many DataFrame methods that take a numeric_only argument (for example sum, mean, etc.). In the past, the default value of numeric_only was None, which resulted in non-numeric columns being silently dropped. Now, the default value of numeric_only is False. This has two consequences:
A non-numeric column that supports the aggregation function is now included in the result. For example, if a column holds str values and the aggregate operation is sum, the result is all of the strings concatenated. In the past this column was silently dropped.
A non-numeric column that does not support the aggregation function now generates a runtime error. In the past it was silently dropped.
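As a rough illustration of the two consequences, here is a minimal sketch assuming pandas >= 2.0 is installed. The frame and its column names (price, label) are made up for the example and do not come from the affected code.

```python
import pandas as pd

# Hypothetical frame with one numeric and one string column.
df = pd.DataFrame({"price": [1.0, 2.0], "label": ["x", "y"]})

# sum is defined for strings, so under the 2.0 default the "label"
# column takes part in the aggregation instead of being dropped.
sums = df.sum()

# mean is not defined for strings, so under the 2.0 default the call
# is expected to raise rather than drop the column.
try:
    df.mean()
    mean_raised = False
except TypeError:
    mean_raised = True

# Passing numeric_only=True explicitly restores the old
# "numeric columns only" result.
numeric_means = df.mean(numeric_only=True)
```

On a pandas 1.x install the bare df.mean() call would be expected to warn and drop the column instead, so mean_raised depends on the installed version.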
In an attempt to facilitate the upgrade, this change rewrites agg("aggregate_func") calls into aggregate_func() calls. Later this will enable using Pyre to detect call sites that require an explicit numeric_only argument to be compatible with Pandas-2.0.0. This is not a semantic change; the effect should be equivalent to the previous spelling.
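The kind of rewrite described above can be sketched with a small regex-based helper. This is only an illustration of the transformation, not the command that produced this change; the function name rewrite_agg_calls and the AGG_METHODS whitelist are assumptions made for the example.

```python
import re

# Hypothetical whitelist: aggregation names that also exist as methods.
AGG_METHODS = ("sum", "mean", "min", "max", "median", "std", "var", "prod")

# Matches .agg("name") or .agg('name') with a single string argument.
_AGG_CALL = re.compile(
    r"\.agg\(\s*([\"'])(" + "|".join(AGG_METHODS) + r")\1\s*\)"
)


def rewrite_agg_calls(source: str) -> str:
    """Rewrite .agg("name") into .name() for whitelisted names only."""
    return _AGG_CALL.sub(lambda m: f".{m.group(2)}()", source)
```

Calls that pass a list, a dict, a callable, or a name outside the whitelist are left untouched, since those have no direct method spelling. A purely textual rewrite like this cannot tell what type the receiver is, so the output still needs manual inspection.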
Summary: X-link: https://github.com/ctrl-labs/src2/pull/34567
The rewrite was generated with a one-liner, and all resulting changes were inspected.
Differential Revision: D62195922