Incorrect output when the input data contains nans

TA-Lib / ta-lib-python

Python wrapper for TA-Lib (http://ta-lib.org/).

http://ta-lib.github.io/ta-lib-python


Incorrect output when the input data contains nans #376

Open Rainfall2013 opened 3 years ago

Rainfall2013 commented 3 years ago

I have a 2-D array that contains many NaNs. Each column has 244 non-NaN floats. When I calculate the MA of each column separately, the function gives the correct result:

    np.sum(~np.isnan(talib.MA(b[:,0],5)))
    Out[106]: 240
    np.sum(~np.isnan(talib.MA(b[:,1],5)))
    Out[107]: 240

However, when I flatten the 2-D array into a 1-D array, the MA output has only 240 non-NaN values in total. After checking the result, I found that the values for the second column are missing:

    np.sum(~np.isnan(talib.MA(b.flatten(order = 'F'),5)))
    Out[108]: 240

Could you please help me fix this problem? Also, do the talib functions support 2-D arrays directly?

Actually, I found that talib won't calculate any values after a NaN, except when the NaNs are at the beginning. For example, with

    f = np.array([nan, 1, 2, 3, 4, 5, 6, nan, nan, 3, 4, 5, 6, 7])

talib.MA(f,2) returns

    array([nan, nan, 1.5, 2.5, 3.5, 4.5, 5.5, nan, nan, nan, nan, nan, nan, nan])

Is there a solution to this problem?
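One plausible explanation for the output reported above is that the moving average is maintained as a running total: once a NaN is added to the total, subtracting it again still leaves NaN, so every later value is NaN too. The sketch below is plain Python written to illustrate that mechanism. It is an assumption about the cause, not a copy of TA-Lib's source, and `running_sum_sma` is a hypothetical name.

```python
import math

def running_sum_sma(values, period):
    """Simple moving average kept as a running total (illustrative only)."""
    n = len(values)
    out = [math.nan] * n
    # Skip leading NaNs; the reported output suggests they are tolerated.
    start = 0
    while start < n and math.isnan(values[start]):
        start += 1
    if n - start < period:
        return out
    total = sum(values[start:start + period - 1])  # first period-1 values
    for i in range(start + period - 1, n):
        total += values[i]               # newest value enters the window
        out[i] = total / period
        total -= values[i - period + 1]  # oldest value leaves the window
    return out

f = [math.nan, 1, 2, 3, 4, 5, 6, math.nan, math.nan, 3, 4, 5, 6, 7]
result = running_sum_sma(f, 2)
# Indices 2..6 hold 1.5, 2.5, 3.5, 4.5, 5.5; everything from index 7 on is NaN,
# because NaN + x and NaN - x are both NaN.
```

Under this assumption, the flattened 2-D case behaves the same way: the first interior NaN poisons the total, so the second column never produces a value.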

mrjbq7 commented 3 years ago

The solution is for you to fill in the nans however you want before calling TA-Lib.

See, for example, how pandas.DataFrame.fillna has a few different algorithms:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html

I don't think there's a reasonable default for TA-Lib to do here...

Rainfall2013 commented 3 years ago


If I fill the NaNs with 0s, that causes other problems, such as the MA values after the 0s being pulled down. So is there no way for TA-Lib indicators to handle NaNs directly?
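The concern about zero-filling can be seen on the example data from this thread. This is a minimal plain-Python sketch; `sma` is a hypothetical stand-in for a windowed mean, not a TA-Lib function.

```python
def sma(values, period):
    """Plain windowed mean; None until a full window is available."""
    return [
        sum(values[i - period + 1:i + 1]) / period if i >= period - 1 else None
        for i in range(len(values))
    ]

# Two datasets (1..6 and 3..7) with the two-NaN gap replaced by zeros.
zero_filled = [1, 2, 3, 4, 5, 6, 0, 0, 3, 4, 5, 6, 7]
result = sma(zero_filled, 2)
# result[6] averages 6 and 0, result[7] averages 0 and 0,
# and result[8] averages 0 and 3, so the zeros leak into neighbouring windows.
```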

mrjbq7 commented 3 years ago

You can fill them by repeating the previous value, averaging between real values, or some other technique.

And then call TA-Lib.

I wouldn’t presume to know what method is relevant for your data set.

The fillna method documentation provides some good suggestions on techniques.
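The two techniques mentioned here, repeating the previous value and averaging between real values, can be sketched in plain Python as follows. The function names are hypothetical; pandas offers comparable operations (for example `Series.ffill` and `Series.interpolate`), which would be the more usual choice in practice.

```python
import math

def forward_fill(values):
    """Replace each NaN with the most recent non-NaN value (leading NaNs stay)."""
    out, last = [], math.nan
    for v in values:
        if not math.isnan(v):
            last = v
        out.append(last)
    return out

def interpolate_linear(values):
    """Fill interior NaN runs on a straight line between their neighbours."""
    out = list(values)
    known = [i for i, v in enumerate(values) if not math.isnan(v)]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = values[left] + step * (i - left)
    return out

f = [math.nan, 1, 2, 3, 4, 5, 6, math.nan, math.nan, 3, 4, 5, 6, 7]
filled_prev = forward_fill(f)        # the gap at indices 7 and 8 becomes 6, 6
filled_line = interpolate_linear(f)  # the gap at indices 7 and 8 becomes 5.0, 4.0
```

Either filled list can then be passed to the indicator. Which fill is appropriate depends on what the gaps mean in the data.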

(Sent as an email reply to Rainfall2013's comment of Jan 6, 2021, 6:29 PM.)

Rainfall2013 commented 3 years ago

Thank you very much for the reply. Maybe I didn't make clear what I really want. The NaNs separate the data from different datasets, so they should be retained. I hope TA-Lib can handle the NaNs as follows. Given

    f = np.array([nan, 1, 2, 3, 4, 5, 6, nan, nan, 3, 4, 5, 6, 7])

talib.MA(f,2) should return

    array([nan, nan, 1.5, 2.5, 3.5, 4.5, 5.5, nan, nan, nan, 3.5, 4.5, 5.5, 6.5])

instead of

    array([nan, nan, 1.5, 2.5, 3.5, 4.5, 5.5, nan, nan, nan, nan, nan, nan, nan])

Can this be achieved?
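One way to get the requested behaviour without any change to TA-Lib is to split the input at the NaN separators, run the indicator on each contiguous run of real values, and write the results back into place. The sketch below is plain Python; `apply_per_segment` and the stand-in `sma` are hypothetical helpers, not part of TA-Lib.

```python
import math

def apply_per_segment(values, func):
    """Apply func to each maximal run of non-NaN values; NaNs stay NaN."""
    n = len(values)
    out = [math.nan] * n
    i = 0
    while i < n:
        if math.isnan(values[i]):
            i += 1
            continue
        j = i
        while j < n and not math.isnan(values[j]):
            j += 1
        # func must return exactly one output per input value.
        out[i:j] = func(values[i:j])
        i = j
    return out

def sma(segment, period=2):
    """Stand-in for the indicator: NaN until a full window exists."""
    return [
        sum(segment[k - period + 1:k + 1]) / period if k >= period - 1 else math.nan
        for k in range(len(segment))
    ]

f = [math.nan, 1, 2, 3, 4, 5, 6, math.nan, math.nan, 3, 4, 5, 6, 7]
result = apply_per_segment(f, sma)
# result matches the array requested above:
# NaN at indices 0, 1, 7, 8, 9; 1.5..5.5 at 2..6; 3.5..6.5 at 10..13.
```

With TA-Lib installed, `func` could presumably be something like `lambda seg: talib.MA(np.asarray(seg, dtype=float), 2)`. How the library treats segments shorter than the period has not been checked here, so such segments may need to be handled separately.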