Some indicators return 0 instead of NaN when there are not enough data points in the dataset to calculate.

ronaimate95 commented 2 years ago

For example, if I calculate EMA20 for a dataset that is 100 days long, the first 19 results will be NaN, as they should be. But some indicators, like HT_TRENDMODE, CDL2CROWS and other CDL indicators, return 0s instead of NaNs. As far as I can see, CDL2CROWS needs at least 13 data points and HT_TRENDMODE needs 64.

So if CDL2CROWS at time 't12', with a large enough window size, is -100, then:
CDL2CROWS([t12]) = [0]
CDL2CROWS([t11, t12]) = [0, 0]
...
CDL2CROWS([t1, t2, ... , t12]) = [0, 0, ... 0] (12 times)
CDL2CROWS([t0, t1, t2, ... , t12]) = [0, 0, ..., 0, -100] (12 zeros, and the last is -100)

and I think in this case it should be
CDL2CROWS([t0, t1, t2, ... , t12]) = [NaN, NaN, ..., NaN, -100] (12 NaNs, and the last is -100),
like when I calculate moving averages...

To explain how I found this: at first I wanted to recreate the functionality that the Streaming API is meant to provide (so I can use talib in a real-time application with a rolling window), when I realized that the Streaming API already exists. I still need to calculate the minimum number of data points each indicator requires (for a given parameter set), but this 0-instead-of-NaN problem makes that harder.
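To make the rolling-window use case concrete, here is a minimal sketch of a buffer that only reports a value once enough points have arrived. It assumes the required window is the number of warm-up points plus one (12 + 1 = 13 for CDL2CROWS, as described above). `RollingIndicator` and `sma3` are hypothetical names for illustration, not part of talib, and `sma3` is a toy single-series stand-in for a real indicator call.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Sequence


class RollingIndicator:
    """Keep just enough recent points to evaluate an indicator's latest value."""

    def __init__(self, func: Callable[[Sequence[float]], Sequence[float]], lookback: int):
        self.func = func
        # `lookback` warm-up points plus the current point.
        self.window: Deque[float] = deque(maxlen=lookback + 1)

    def update(self, value: float) -> Optional[float]:
        """Add one point; return the latest indicator value, or None during warm-up."""
        self.window.append(value)
        if len(self.window) < self.window.maxlen:
            # Not enough data yet, so whatever func would return here is not meaningful.
            return None
        return self.func(list(self.window))[-1]


def sma3(values: Sequence[float]) -> List[float]:
    """Toy stand-in for an indicator: 3-period simple moving average (2 warm-up points)."""
    out = [float("nan")] * len(values)
    for i in range(2, len(values)):
        out[i] = sum(values[i - 2:i + 1]) / 3
    return out


roller = RollingIndicator(sma3, lookback=2)
results = [roller.update(v) for v in [1.0, 2.0, 3.0, 4.0]]
print(results)  # [None, None, 2.0, 3.0]
```

Returning `None` during warm-up sidesteps the 0-versus-NaN ambiguity entirely, because the caller never sees the placeholder values. Note that indicators with memory, such as EMA, may give different values on a minimal window than on a long history, so a larger window may be needed for those.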

mrjbq7 commented 2 years ago

We have been returning an integer array similar to the C functions, but this is a good point. Either we should expose the lookback, return a smaller array of values, return a double array with NaNs, or something else.

Thoughts?
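For comparison, two of these options can be sketched on the caller's side, given the raw integer output and a known number of warm-up points. This is only an illustration under those assumptions: `pad_warmup_with_nan` and `trim_warmup` are hypothetical helpers, not talib functions, and they work on plain Python sequences.

```python
import math
from typing import List, Sequence


def pad_warmup_with_nan(raw: Sequence[int], lookback: int) -> List[float]:
    """Return floats, with the first `lookback` entries replaced by NaN."""
    n = min(lookback, len(raw))
    return [math.nan] * n + [float(v) for v in raw[n:]]


def trim_warmup(raw: Sequence[int], lookback: int) -> List[int]:
    """Return only the entries that could actually be computed."""
    return list(raw[lookback:])


# Shape of the CDL2CROWS output described in the report: 12 zeros, then -100.
raw = [0] * 12 + [-100]

padded = pad_warmup_with_nan(raw, 12)   # same length as the input, floats
trimmed = trim_warmup(raw, 12)          # shorter, still integers
print(len(padded), padded[-1], trimmed)
```

The padded form keeps the output aligned with the input index at the cost of switching to floats; the trimmed form keeps integers but shifts positions by the warm-up length. With NumPy arrays, the padded form would be `out = raw.astype(float)` followed by `out[:lookback] = np.nan`.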


trufanov-nok commented 2 years ago

> I still need to calculate the minimal required number of datapoints for the indicators (with a given parameter-set), but this 0 instead of NaN problem makes it harder.

Aren't the lookback functions accessible from Python via the wrapper?

trufanov-nok commented 2 years ago

It looks like they aren't. Perhaps it wouldn't be too difficult to clone abstract.Function() from abstract.py into an abstract.Lookback() or similar, which would do the same as abstract.Function() but for the lookback functions?

trufanov-nok commented 2 years ago

Oh, there is indeed a way to access the lookback values:

from talib import abstract

func = abstract.Function('CDL2CROWS')
print(func.lookback)  # 12

func = abstract.Function('HT_TRENDMODE')
print(func.lookback)  # 63

ronaimate95 commented 2 years ago

> Oh, there's a way to access a lookback values indeed:
>
> from talib import abstract
> func = abstract.Function('CDL2CROWS')
> func.lookback
> >> 12
>
> func = abstract.Function('HT_TRENDMODE')
> func.lookback
> >> 63

I think this is exactly what I need, thanks!

ronaimate95 commented 2 years ago

> We have been returning an integer array similar to the C functions, but this is a good point. Either we should expose the look back, return a smaller array of values, return a double array with NaNs, or something else. Thoughts?

I think the most consistent solution would be this: if something can't be calculated by definition (as in this case, with not enough data points), the result should be NaN. The fact that an integer array can't contain NaN makes it more complicated, but in that case switching to double array outputs might be worth it. Anyway, it could be deceptive if the function returns a valid output for a "wrong" input.

ronaimate95 commented 2 years ago

I found that abstract.Function('MAMA').lookback returns -1. Why is that? When I calculate it, the first 31 elements in the two result arrays are NaN, so shouldn't it be 31?

trufanov-nok commented 2 years ago

Well, it seems to be a bug in the wrapper. In _abstract.pxi, the declaration
cdef int __ta_setOptInputParamReal(lib.TA_ParamHolder *holder, int idx, int value):
must be
cdef int __ta_setOptInputParamReal(lib.TA_ParamHolder *holder, int idx, double value):
I'll push a fix.

mrjbq7 commented 2 years ago

That’s a good idea, I can’t see right now but I think the abstract function might already have a lookback attribute on it.


TA-Lib / ta-lib-python, issue #532