TA-Lib / ta-lib-python

Python wrapper for TA-Lib (http://ta-lib.org/).
http://ta-lib.github.io/ta-lib-python

Differences between talib and hand writing code #196

Open TerrenceVarada opened 6 years ago

TerrenceVarada commented 6 years ago

Thanks for such a great tool. It has worked perfectly throughout my research. Now I want to use it in Spark. Since I can't import talib in the online environment, I have to rewrite the indicators in pure Python. When I compare your results with mine, I always find some differences. Could you open your source code? Looking forward to hearing from you, thanks again.

mrjbq7 commented 6 years ago

Good to hear. The source code is super open! First, this project is a Python wrapper around an underlying C library (http://ta-lib.org).

You can download the last release of source code (a decade ago or so) here:

http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz

You can also view code online (trunk contains latest code with a few unreleased commits it looks like):

https://sourceforge.net/p/ta-lib/code/HEAD/tree/trunk/ta-lib/

For example, you can see the source code for the MOM momentum indicator here:

https://sourceforge.net/p/ta-lib/code/HEAD/tree/trunk/ta-lib/c/src/ta_func/ta_MOM.c

It's a little hard to read, since it supports multiple languages, and mostly also because of C performance and verbosity, but you get used to it.
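
For anyone porting indicators to pure Python, MOM is a gentle place to start. Below is a minimal sketch, assuming the usual definition (the current value minus the value `timeperiod` bars earlier, with the first `timeperiod` outputs left as NaN). The function name `mom` is hypothetical and this is not the library's code:

```python
import math


def mom(values, timeperiod=10):
    """Momentum: values[i] - values[i - timeperiod]; NaN during the lookback."""
    out = [math.nan] * len(values)
    for i in range(timeperiod, len(values)):
        out[i] = values[i] - values[i - timeperiod]
    return out


# Example with a short series and a lookback of 2 bars.
example = mom([1.0, 3.0, 6.0, 10.0], timeperiod=2)
print(example)
```

Comparing such a sketch against talib.MOM on the same input is a quick way to confirm that the lookback handling matches before moving on to stateful indicators such as SAR.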

And for completeness, this is how I call it from the Cython module:

https://github.com/mrjbq7/ta-lib/blob/master/talib/_func.pxi#L3996

TerrenceVarada commented 6 years ago

Thanks for your reply, it is very helpful. I tried to rewrite SAR in Python, but the output didn't match. Do you make any adjustments to the data before passing it to the indicator calculation? Could you help check my code?

TerrenceVarada commented 6 years ago

To explain my question, here is my data:

Date: 20170103, close: 15.2, open: 15.06, high: 15.25, low: 15.06
Date: 20170104, close: 15.46, open: 15.18, high: 15.52, low: 15.17
Date: 20170105, close: 15.69, open: 15.82, high: 15.93, low: 15.68

According to your code, on 20170104 isLong equals 1, so sar equals 15.06, which is newLow = inLow[todayIdx-1]. The calculation then moves to 20170105. Since the previous isLong equals 1, newHigh equals inHigh[todayIdx], which is 15.93. That is higher than the previous ep of 15.52, so the code adjusts af and ep: ep becomes newHigh and af becomes 0.04. The new sar computed from these values is 15.0948. But the value returned by ta.SAR is 15.0692. When I recalculated that number, it looks as if the newHigh > ep step was skipped. Could you please explain this?
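
Both numbers can be reproduced with a single update step each. Here is a small check, assuming the update rule sar + af * (ep - sar) described above; the variable names are illustrative only:

```python
prev_sar = 15.06   # low of 20170103, the SAR reported for 20170104
init_af = 0.02

# Update that uses the 20170105 high as the new extreme point (af stepped to 0.04).
sar_after_adjust = prev_sar + 0.04 * (15.93 - prev_sar)

# Update that keeps the previous extreme point (20170104 high) and the initial af.
sar_no_adjust = prev_sar + init_af * (15.52 - prev_sar)

print(round(sar_after_adjust, 4), round(sar_no_adjust, 4))
```

The second value is the one ta.SAR returns, which suggests the figure reported for 20170105 was computed before that day's high was taken into account.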

The attachments are my data, the calculated results, and my code. In the data, ta-sar is computed by ta.SAR and sar is computed by me. Looking forward to hearing from you. Thanks.

mrjbq7 commented 6 years ago

Your attachments didn't come through. Maybe you can post the code and files somewhere, for example as a gist (http://gist.github.com).

But reading through the source code for the TA-Lib version of SAR:

https://sourceforge.net/p/ta-lib/code/HEAD/tree/trunk/ta-lib/c/src/ta_func/ta_SAR.c#l240

I see this comment, maybe it's helpful:

   /* Implementation of the SAR has been a little bit open to interpretation
    * since Wilder (the original author) did not define a precise algorithm
    * on how to bootstrap the algorithm. Take any existing software application
    * and you will see slight variation on how the algorithm was adapted.
    *
    * What is the initial trade direction? Long or short?
    * ===================================================
    * The interpretation of what should be the initial SAR values is
    * open to interpretation, particularly since the caller to the function
    * does not specify the initial direction of the trade.
    *
    * In TA-Lib, the following logic is used:
    *  - Calculate +DM and -DM between the first and
    *    second bar. The highest directional indication will
    *    indicate the assumed direction of the trade for the second
    *    price bar. 
    *  - In the case of a tie between +DM and -DM,
    *    the direction is LONG by default.
    *
    * What is the initial "extreme point" and thus SAR?
    * =================================================
    * The following shows how different people took different approach:
    *  - Metastock use the first price bar high/low depending of
    *    the direction. No SAR is calculated for the first price
    *    bar.
    *  - Tradestation use the closing price of the second bar. No
    *    SAR are calculated for the first price bar.
    *  - Wilder (the original author) use the SIP from the
    *    previous trade (cannot be implement here since the
    *    direction and length of the previous trade is unknonw).
    *  - The Magazine TASC seems to follow Wilder approach which
    *    is not practical here.
    *
    * TA-Lib "consume" the first price bar and use its high/low as the
    * initial SAR of the second price bar. I found that approach to be
    * the closest to Wilders idea of having the first entry day use
    * the previous extreme point, except that here the extreme point is
    * derived solely from the first price bar. I found the same approach
    * to be used by Metastock.
    */
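
The bootstrap rule described in that comment can be sketched as follows. This is only one interpretation of the comment, with Wilder's usual +DM/-DM definitions taken as an assumption; `initial_direction` is a hypothetical helper, not a TA-Lib function:

```python
def initial_direction(high, low):
    """Return (is_long, first_sar) for the second bar, using the first two bars."""
    up_move = high[1] - high[0]
    down_move = low[0] - low[1]
    # Wilder-style directional movement: only the larger, positive move counts.
    plus_dm = up_move if (up_move > down_move and up_move > 0) else 0.0
    minus_dm = down_move if (down_move > up_move and down_move > 0) else 0.0
    is_long = plus_dm >= minus_dm  # a tie defaults to long
    # The first bar is consumed: its low (long) or high (short) seeds the SAR.
    first_sar = low[0] if is_long else high[0]
    return is_long, first_sar


# First two bars from the data posted in this thread.
print(initial_direction([15.25, 15.52], [15.06, 15.17]))
```

Note that the direction here depends only on highs and lows, so a port that derives it from open and close prices may start in a different direction on some series.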
TerrenceVarada commented 6 years ago

The following is my code. I think the problem may be caused by lines 403 to 428 of your ta_SAR.c file.

import pandas as pd
import numpy as np
import talib as ta


def initial_sar(opens, close, high, low, i, n=1):
    # Guess the initial direction from the summed close - open over bars i-n..i.
    tmp_close = close[i - n:i + 1]
    tmp_open = opens[i - n:i + 1]
    up_down = sum(pd.Series(tmp_close) - pd.Series(tmp_open))

    newhigh = high[i - 1]
    newlow = low[i - 1]

    if up_down >= 0:
        islong = 1
        ep = high[i]
        sar = newlow
    else:
        islong = 0
        ep = low[i]
        sar = newhigh

    return islong, sar, ep, newlow, newhigh

def next_sar(high, low, islong, sar, ep, prevLow, prevHigh, i, init_af, max_af, af):

    newLow = low[i]
    newHigh = high[i]

    if islong == 1:
        if newLow <= sar:
            islong = 0
            sar = ep

            if sar < prevHigh:
                sar = prevHigh
            elif sar < newHigh:
                sar = newHigh

            # output sar

            af = init_af
            ep = newLow

            sar = sar + af * (ep - sar)

            if sar < prevHigh:
                sar = prevHigh
            if sar < newHigh:
                sar = newHigh
        else:

            # output sar

            if newHigh > ep:
                ep = newHigh
                af += init_af
                af = min(af, max_af)

            sar = sar + af * (ep - sar)

            if sar > prevLow:
                sar = prevLow
            if sar > newLow:
                sar = newLow
    else:
        if newHigh >= sar:
            islong = 1
            sar = ep

            if sar > prevLow:
                sar = prevLow
            if sar > newLow:
                sar = newLow
            # output sar

            af = init_af
            ep = newHigh

            sar = sar + af * (ep - sar)

            if sar > prevLow:
                sar = prevLow
            if sar > newLow:
                sar = newLow
        else:
            # output sar

            if newLow < ep:
                ep = newLow
                af += init_af
                af = min(af, max_af)

            sar = sar + af * (ep - sar)

            if sar < prevHigh:
                sar = prevHigh
            if sar < newHigh:
                sar = newHigh

    return islong, sar, ep, newLow, newHigh, af

def sar_copy_talib(df, init_af=0.02, max_af=0.2):

    # prepare data
    opens = df['open'].values
    close = df['close'].values
    high = df['high'].values
    low = df['low'].values

    # save data
    l_islong = [np.nan] * len(df.index)
    l_sar = [np.nan] * len(df.index)
    l_ep = [np.nan] * len(df.index)
    l_newlow = [np.nan] * len(df.index)
    l_newhigh = [np.nan] * len(df.index)
    l_af = [np.nan] * len(df.index)

    for i in range(1, len(df.index)):
        if i == 1:
            islong, sar, ep, newlow, newhigh = initial_sar(opens, close, high, low, i)

            l_islong[i] = islong
            l_sar[i] = sar
            l_ep[i] = ep
            l_newlow[i] = newlow
            l_newhigh[i] = newhigh
            l_af[i] = init_af

        elif i > 1:
            islong = l_islong[i - 1]
            sar = l_sar[i - 1]
            ep = l_ep[i - 1]
            newlow = l_newlow[i - 1]
            newhigh = l_newhigh[i - 1]
            af = l_af[i - 1]

            islong, sar, ep, newlow, newhigh, af = next_sar(high, low, islong, sar, ep, newlow, newhigh,
                                                            i, init_af, max_af, af)

            l_islong[i] = islong
            l_sar[i] = sar
            l_ep[i] = ep
            l_newlow[i] = newlow
            l_newhigh[i] = newhigh
            l_af[i] = af

    return l_islong, l_sar, l_ep, l_newlow, l_newhigh

df = pd.read_csv('indicator_test.csv', index_col=['date'])
print(df.head())

df['islong'], df['sar'], df['ep'], df['newlow'], df['newhigh'] = sar_copy_talib(df)

df['ta-sar'] = ta.SAR(np.array(df['high']), np.array(df['low']), acceleration=0.02, maximum=0.2)

print(df)
df.to_csv('copy_code.csv', index_label=['date'])
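
One possible reading of the C loop, offered as a sketch and not as the library's implementation: each pass first emits the SAR prepared on the previous pass and only then updates ep and af for the next bar, and the second bar itself goes through that loop once. The function below (hypothetical name `sar_emit_then_update`, with the direction taken from the +DM/-DM rule in the quoted comment) follows that ordering. On the three bars posted earlier it gives 15.06 and then 15.0692.

```python
import math


def sar_emit_then_update(high, low, acceleration=0.02, maximum=0.2):
    n = len(high)
    out = [math.nan] * n
    if n < 2:
        return out

    # Assumed bootstrap: short only if the down move dominates; a tie is long.
    up_move = high[1] - high[0]
    down_move = low[0] - low[1]
    is_long = not (down_move > up_move and down_move > 0)

    af = min(acceleration, maximum)
    if is_long:
        ep, sar = high[1], low[0]
    else:
        ep, sar = low[1], high[0]

    new_low, new_high = low[1], high[1]
    for i in range(1, n):
        prev_low, prev_high = new_low, new_high
        new_low, new_high = low[i], high[i]
        if is_long:
            if new_low <= sar:
                # Reversal to short: SAR jumps to the old extreme point.
                is_long = False
                sar = max(ep, prev_high, new_high)
                out[i] = sar
                af, ep = min(acceleration, maximum), new_low
                sar = max(sar + af * (ep - sar), prev_high, new_high)
            else:
                out[i] = sar  # emit first, update afterwards
                if new_high > ep:
                    ep = new_high
                    af = min(af + acceleration, maximum)
                sar = min(sar + af * (ep - sar), prev_low, new_low)
        else:
            if new_high >= sar:
                # Reversal to long.
                is_long = True
                sar = min(ep, prev_low, new_low)
                out[i] = sar
                af, ep = min(acceleration, maximum), new_high
                sar = min(sar + af * (ep - sar), prev_low, new_low)
            else:
                out[i] = sar  # emit first, update afterwards
                if new_low < ep:
                    ep = new_low
                    af = min(af + acceleration, maximum)
                sar = max(sar + af * (ep - sar), prev_high, new_high)
    return out


result = sar_emit_then_update([15.25, 15.52, 15.93], [15.06, 15.17, 15.68])
print([round(x, 4) for x in result[1:]])
```

If this reading is right, the port above differs in two places: it stores the updated SAR as the current bar's output, and it does not run the second bar through the update loop.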

mrjbq7 commented 6 years ago

Interesting. Not sure, without spending more time on it which I don't have right now. The C code is the best place to see what differences might exist.

However, it's not my C code. I just support this Python wrapper for that TA-Lib underlying C library, and do my best to answer questions, etc. But I can't help you with the thinking of the original author or the behavior they intended.

TerrenceVarada commented 6 years ago

Could you please take some time to look at it? From what I have tested, the difference is caused by lines 403 to 428 of the ta_SAR.c file. Sorry for the previous misunderstanding.

Have a good weekend, Chen

bhavishyagoyal12 commented 5 years ago

It doesn't match the IB PSAR either. Could you let me know how we can fix that?

Linganna commented 4 years ago

Hi @TerrenceVarada, hope you are doing well. We are also looking for a technical indicator calculation library that can be used on Spark, where the calculation happens on the cluster. When I saw your message I was excited that you were trying to do the same. Have you made any progress on this? If you could share some thoughts on how to take it forward, it would be a great help. Thank you.