pmorissette / ffn

ffn - a financial function library for Python
pmorissette.github.io/ffn
MIT License

Incorrect missing data handling with pandas v1+ #152

Open aputkov opened 3 years ago

aputkov commented 3 years ago

Pandas v1+ introduced a new missing-value marker, pd.NA. ffn v0.3.6 uses numpy's isnan function to identify missing data. The problem is that np.isnan(pd.NA) returns pd.NA, whose truth value is ambiguous in boolean expressions.

As a result, an exception is raised when the calc_stats function computes the return table from monthly returns. The monthly returns necessarily have a missing value in the first row of the DataFrame or Series. This value is passed to np.isnan on line 320 of core.py, in the _calculate function of the PerformanceStats class. With pandas v1+, the missing value passed to np.isnan is pd.NA, so the expression np.isnan(mr[idx]) evaluates to pd.NA, and 'if np.isnan(mr[idx])' raises an exception.

To make ffn work with pandas v1+, the np.isnan call on line 320 of core.py needs to be replaced with pd.isna. In addition, and for the same reason, the expression 'if np.isnan(number)' on line 115 of utils.py needs to be replaced with 'if (np.isnan(number) | pd.isna(number))'.
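A minimal sketch that reproduces the behavior described above outside of ffn, assuming pandas 1.0 or later is installed. It does not touch ffn itself, and the helper name is_missing is hypothetical, used here only to illustrate the pd.isna replacement.

```python
import numpy as np
import pandas as pd

# np.isnan on pd.NA does not return a bool; pd.NA propagates itself.
result = np.isnan(pd.NA)

# Using that result as an `if` condition raises, because the truth
# value of pd.NA is ambiguous.
try:
    if np.isnan(pd.NA):
        pass
    raised = False
except TypeError:
    raised = True


def is_missing(value):
    """Hypothetical helper (not part of ffn): True for NaN, None and pd.NA."""
    # pd.isna returns a plain boolean for scalar input, including pd.NA.
    return bool(pd.isna(value))


print(result is pd.NA)        # np.isnan handed pd.NA straight back
print(raised)                 # the `if` above raised TypeError
print(is_missing(pd.NA))      # handled without an exception
print(is_missing(np.nan))     # ordinary NaN is still detected
print(is_missing(0.05))       # a real return value is not missing
```

Because pd.isna covers both np.nan and pd.NA for scalar input, a check written this way works the same whichever missing-value marker pandas puts in the first row.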