TA-Lib / ta-lib-python

Python wrapper for TA-Lib (http://ta-lib.org/).
http://ta-lib.github.io/ta-lib-python

NaN issue in ta-lib #492

Closed hossain93 closed 7 months ago

hossain93 commented 2 years ago

Hi everybody, I posted a question on Stack Overflow (https://stackoverflow.com/questions/70712489/why-i-take-this-plot-with-matplotlib-pyplot-when-add-date-too-x-axis) about a NaN problem in TA-Lib. Can you help me with this problem? Please copy the link and paste it into your search box :)

mrjbq7 commented 2 years ago

I'm unsure what the problem is with NaN in TA-Lib.

If your data includes NaN, then it will cause the TA-Lib library to produce NaN.
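To illustrate the point above, here is a minimal pure-Python sketch of how a single NaN in the input can turn every later output into NaN. It assumes the moving average is computed with a running sum, which is my understanding of how TA-Lib's SMA works but is not verified here. The function name `sma_running_sum` is hypothetical and is not part of TA-Lib.

```python
import math


def sma_running_sum(values, period):
    """Simple moving average via a running sum, NaN-padded to the input length.

    Illustrative only: this is an assumption about the general technique,
    not TA-Lib's actual code.
    """
    out = [math.nan] * len(values)
    if period < 1 or len(values) < period:
        return out
    # Sum of the first (period - 1) values; the loop adds the newest value.
    total = sum(values[:period - 1])
    for i in range(period - 1, len(values)):
        total += values[i]                 # add the value entering the window
        out[i] = total / period
        total -= values[i - period + 1]    # drop the value leaving the window
    return out


clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
dirty = [math.nan] + clean[1:]             # same data, but the first value is NaN

print(sma_running_sum(clean, 3))   # two leading NaNs, then real averages
print(sma_running_sum(dirty, 3))   # NaN everywhere: the running sum never recovers
```

Because `nan + x` and `nan - x` are both NaN, the running total stays NaN even after the bad value has left the window.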

hossain93 commented 2 years ago

My data doesn't have any NaN.

mrjbq7 commented 2 years ago

What is the issue that you are having?

hossain93 commented 2 years ago

When I use indicators, I get NaN values in the first rows of my data frame. The NaNs appear as if the data did not exist, but the data does exist. I want to calculate the indicators starting from my first row, but I think TA-Lib starts calculating from the -1 cell, which does not exist, so I get NaN for the first rows. For example, when I calculate the EMA20 indicator I get 19 NaN values at the start of my data frame, because TA-Lib starts calculating from the -1 cell, which does not exist.

text3.txt

I use these indicators:

test['avg'] = ta.ADX(test['high'], test['low'], test['close'], timeperiod=20)

test['EMA20'] = ta.EMA(test['close'], timeperiod=20)
test['EMA50'] = ta.EMA(test['close'], timeperiod=50)

test['SMA20'] = ta.SMA(test['close'], timeperiod=20)
test['SMA50'] = ta.SMA(test['close'], timeperiod=50)

test['ADX'] = ta.ADX(test['high'], test['low'], test['close'], timeperiod=14)

macd, macdsignal, macdhist = ta.MACD(test['close'], fastperiod=12, slowperiod=26, signalperiod=9)

test['RSI'] = ta.RSI(test['close'], timeperiod=14)

upper, mid, lower = ta.BBANDS(test['close'], nbdevup=2, nbdevdn=2, timeperiod=20)

When I want to calculate EMA200, it is difficult because I lose the first 199 rows :(

Can you help me?

Excuse my weak English.

trufanov-nok commented 2 years ago

EMA20 is a 20-day average and can't be calculated for a dataset that contains only 1 day, or only 2 days, and so on, until your data is at least 20 days long. In that case it returns 19 NaN values, with the first non-NaN value on day 20. It's not looking for non-existent cells; it intentionally returns NaN here. If you have 365 days of data and you start the EMA20 calculation from the middle (so the -1, -2 cells before the start are not NaN), you'll still get 19 NaNs at the beginning of the result. Consider these initial NaN values as "not enough data to calculate" values.
If I recall correctly, it's the Python wrapper that inserts these NaNs for simplicity. The underlying C library just returns a result array that is 19 values shorter than the input array, along with the day number (20 for EMA20) at which the result starts relative to the input.
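The warm-up behavior described above can be sketched in pure Python. This is only an illustration under stated assumptions: the first EMA value is seeded with the simple average of the first `period` inputs and smoothed with `k = 2 / (period + 1)`, which I believe matches TA-Lib's default but have not verified here. The helper `ema_with_warmup` is a hypothetical name, not a TA-Lib function.

```python
import math


def ema_with_warmup(values, period):
    """EMA padded with NaN for the first (period - 1) positions.

    Assumption: seed with the SMA of the first `period` values, then apply
    the usual recursive smoothing. Not TA-Lib's actual implementation.
    """
    out = [math.nan] * len(values)
    if period < 1 or len(values) < period:
        return out                      # not enough data: every output is NaN
    k = 2.0 / (period + 1)
    ema = sum(values[:period]) / period  # seed value, placed at index period - 1
    out[period - 1] = ema
    for i in range(period, len(values)):
        ema = (values[i] - ema) * k + ema
        out[i] = ema
    return out


closes = [float(i) for i in range(1, 31)]   # 30 made-up closing prices
result = ema_with_warmup(closes, 20)
leading_nans = sum(1 for v in result if math.isnan(v))

print(leading_nans)   # 19
print(result[19])     # 10.5, the average of 1..20
```

The same reasoning applies to EMA200: the first 199 outputs are NaN no matter how clean the input is, so the usual way to get values for a given window is to feed in at least 199 extra rows of history before it.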

hossain93 commented 2 years ago

I was mistaken, and I understand now. Thank you.

dss010101 commented 1 year ago

Curious: would this happen if the first value in the series is NaN, but all others are valid float values? I ask because I experienced a similar issue after moving from a Windows to a Linux (containerized) environment. I had no issues on Windows, but on Linux a simple ta_lib.SMA produced all NaN values. After seeing this post and dropping that first NaN row, it worked. So I'm wondering what was different between the Windows and Linux environments?

mrjbq7 commented 1 year ago

I am not aware of a difference between Windows and Linux NaN handling.


dss010101 commented 1 year ago

Thanks. Then it must be that my data has changed.

mrjbq7 commented 1 year ago

If you run a quick test on both systems, do you see a difference?
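One possible shape for such a quick test, as a sketch only: print the platform and summarize where NaNs sit in the input, so the two environments can be compared on the data itself before any indicator is involved. The helper `nan_report` and the sample values are hypothetical.

```python
import math
import platform


def nan_report(values):
    """Summarize NaN positions in a sequence of floats (hypothetical helper)."""
    nan_positions = [
        i for i, v in enumerate(values)
        if isinstance(v, float) and math.isnan(v)
    ]
    return {
        "length": len(values),
        "nan_count": len(nan_positions),
        "first_nan": nan_positions[0] if nan_positions else None,
    }


sample = [math.nan, 10.0, 11.0, 12.0]   # made-up series with a leading NaN

print(platform.system(), platform.python_version())
print(nan_report(sample))
```

If the reports differ between the two machines, the difference is in the input data rather than in the library.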


mrjbq7 commented 7 months ago

Closing old issue.