twopirllc / pandas-ta

Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 150+ Indicators
https://twopirllc.github.io/pandas-ta/
MIT License

ADX value from plugin vastly different from Tradingview, just me or did I mess up some entry? #522

Open whatayush opened 2 years ago

whatayush commented 2 years ago

Hi,

So first off this problem persists across multiple plugins and code. I love the pandas_ta library thus was hoping we could find a solution here.

First off here is the difference in values: image image

Here is my code:

def each_copy(ticker):
    ticker_data = pd.DataFrame.copy(df[df["ticker"]==ticker],deep=False)
    ticker_data.set_index('date',inplace=True)
    return ticker_data

test_data = each_copy("MRPL")
test_data["adx"] = ta.adx(high=test_data.high,low=test_data.low,close=test_data.close)["ADX_14"]
test_data["aDMP_14"] = ta.adx(high=test_data.high,low=test_data.low,close=test_data.close)["DMP_14"]
test_data["aDMN_14"] = ta.adx(high=test_data.high,low=test_data.low,close=test_data.close)["DMN_14"]
test_data.loc["2022-04-19 15:15:00+05:30":]

Data from the function 'each copy' is returned this way image

After adding adx data image

I hope this is enough, I did notice that the code on Tradingview is different from what we are using here. Could that be the reason? Does TV use some other formula for ADX?

I am not much of a coder thus relying on your help.

Please help and thank you.

twopirllc commented 2 years ago

Hello @whatayush,

Could you please fill out these initial questions that were asked on the Bug Report?

Which version are you running? The latest version is on GitHub. Pip is for major releases.

import pandas_ta as ta
print(ta.version)

Do you have TA Lib also installed in your environment?

$ pip list

Did you upgrade? Did the upgrade resolve the issue?

$ pip install -U git+https://github.com/twopirllc/pandas-ta

Thanks, KJ

whatayush commented 2 years ago

Hi,

The version I have is 0.3.14b0. From checking on github page in my capacity, this is the latest version right?

These are the ta related installed: image

I see TA lib is missing. Could you let me know if installing it could solve the issue? Also what would be the pip install code.

Thanks

twopirllc commented 2 years ago

@whatayush,

So first off this problem persists across multiple plugins and code.

What does that mean?

The version I have is 0.3.14b0. From checking on github page in my capacity, this is the latest version right?

It is the last stable main version. I recommend trying out the development version instead, since there was a fix for rma this week that may fix the issue. Note: adx by default uses rma for its average.

$ pip install -U git+https://github.com/twopirllc/pandas-ta.git@development

I see TA lib is missing. Could you let me know if installing it could solve the issue? Also what would be the pip install code.

TA Lib is not a requirement for this library as this library is a Python version of TA Lib since TA Lib can be difficult to install in some cases. This is not a TradingView library, though I try to accommodate their implementation if it does not exist in TA Lib.

Let me know if the development version helps!

KJ

twopirllc commented 2 years ago

@whatayush,

On another note, you need not call adx three times.

def each_copy(ticker):
    ticker_data = pd.DataFrame.copy(df[df["ticker"]==ticker], deep=False)
    ticker_data.set_index('date', inplace=True)
    return ticker_data

test_data = each_copy("MRPL")
test_data.ta.adx(append=True)
test_data.rename(columns={"ADX_14": "ADX", "DMP_14": "aDMP_14", "DMN_14": "aDMN_14"}, inplace=True)

test_data.loc["2022-04-19 15:15:00+05:30":]

KJ

whatayush commented 2 years ago

Hi,

So to your first point.

I tried ADX of 3 different setups and all gave wrong values compared to TV. Yours, ta library (its called just ta) and a code I received from a course.

So I am thinking if ADX on TV is calculated differently.

On the development version, I am just wondering if I should use it. I am not a coder as such so dont want any issues to arise that could cause more problem. Your suggestion on this. If you think its okay then will go ahead.

I did test with different ma modes but the values were still not correct.

If not are there any other modules that can be taken in place of ADX?

Also TA-Lib is not getting installed, its asking for some version of visual studio which is not getting installed.

Thanks for the tip on last one. If I just want the adx value then I can simply use the first code above right? test_data["adx"] = ta.adx(high=test_data.high,low=test_data.low,close=test_data.close)["ADX_14"]

twopirllc commented 2 years ago

@whatayush,

I tried ADX of 3 different setups and all gave wrong values compared to TV. Yours, ta library (its called just ta) and a code I received from a course.

Got it

So I am thinking if ADX on TV is calculated differently.

In short, TA in general is not standardized, though it has gotten a bit better. For instance, the de facto TA Lib has different indicator initializations since broker platforms would have slight variations for whatever reason... but that's another story.

Regarding TV, they fairly recently updated their code base to version 5, so there are likely some differences from when I originally wrote Pandas TA's adx, and it will likely need an adjustment. 😑

On the development version, I am just wondering if I should use it. I am not a coder as such so dont want any issues to arise that could cause more problem. Your suggestion on this. If you think its okay then will go ahead.

First, the development version has lots of improvements including speed and better documentation. Second, adx internally uses: atr (which uses true_range and rma) as well as rma all of which have been updated. As of now, the best version is the development version.

Also TA-Lib is not getting installed, its asking for some version of visual studio which is not getting installed.

This is one of the primary reasons I wrote Pandas TA in the first place. Pandas TA is the best Python replica of TA Lib out there; shared indicators have r > 0.99 correlation between PTA and TA Lib.

Thanks for the tip on last one. If I just want the adx value then I can simply use the first code above right? test_data["adx"] = ta.adx(high=test_data.high,low=test_data.low,close=test_data.close)["ADX_14"]

Sure. Or

test_data["adx"] = ta.adx(high=test_data.high,low=test_data.low,close=test_data.close).iloc[:,0]

# or
test_data["adx"] = test_data.ta.adx().iloc[:,0]

KJ

whatayush commented 2 years ago

Hi,

Tried to update but getting this error. image

whatayush commented 2 years ago

Added by installing git. And testing now.

whatayush commented 2 years ago

Still the values are too high, specifically in this period. image

hakumaku commented 2 years ago

The reason the adx value is different is that TradingView handles the initial value specially, while pandas-ta does not.

I faced this issue recently. I don't know the finer details of the adx indicator, but I figured out where the difference comes from by looking at the code.

As you might know, rma depends on the previous value to get the current value. How the first value of rma is set seems to vary between implementations, but TradingView uses the average of the first `length` values.
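As a rough illustration of that seeding rule, here is a minimal sketch of an RMA whose first value is the plain average of the first `length` inputs. The function name `rma_sma_seeded` is hypothetical and not part of pandas-ta; it only assumes the seeding behaviour described above and NaN-free input.

```python
import numpy as np
import pandas as pd


def rma_sma_seeded(values: pd.Series, length: int) -> pd.Series:
    """Wilder-style average seeded with the mean of the first `length` values.

    Hypothetical helper for illustration; assumes `values` contains no NaNs.
    """
    alpha = 1.0 / length
    out = np.full(len(values), np.nan)
    if len(values) < length:
        return pd.Series(out, index=values.index)

    x = values.to_numpy(dtype=float)
    out[length - 1] = x[:length].mean()  # seed: simple average
    for i in range(length, len(x)):
        # recursive step: depends on the previous smoothed value
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return pd.Series(out, index=values.index)


demo = rma_sma_seeded(pd.Series([1.0, 2.0, 3.0, 4.0, 5.0]), length=3)
print(demo.tolist())
```

The first `length - 1` outputs stay NaN, which is why a seeded RMA produces its first value later than an unseeded one.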

Below is the relevant part of the adx function. The version (master or dev) does not matter, because neither seems to handle the first value.

def adx(
    high: Series, low: Series, close: Series,
    length: Int = None, lensig: Int = None, scalar: IntFloat = None,
    mamode: str = None, drift: Int = None,
    offset: Int = None, **kwargs: DictLike
) -> DataFrame:
    ...
    k = scalar / atr_
    dmp = k * ma(mamode, pos, length=length)
    dmn = k * ma(mamode, neg, length=length)

    dx = scalar * (dmp - dmn).abs() / (dmp + dmn)
    adx = ma(mamode, dx, length=lensig)
    ...

All of dmp, dmn and adx depend on ma("rma", ...) method.

If you delve into the function deeper, it basically uses pandas.DataFrame.ewm method.

It has an `adjust` parameter (bool, default True) that you should pay attention to.

Setting it to True acts as the following: $y_t = \frac{x_t + (1 - \alpha)x_{t-1} + (1 - \alpha)^2 x_{t-2} + ... + (1 - \alpha)^t x_0}{1 + (1 - \alpha) + (1 - \alpha)^2 + ... + (1 - \alpha)^t}$

False, which is what we want (to sync with TradingView): $\begin{split} y_0 &= x_0 \\ y_t &= (1 - \alpha) y_{t-1} + \alpha x_t \end{split}$

It seems pandas does not provide a way to override the $y_0 = x_0$ initialization, and I couldn't find one either. (Someone asked the same question on Stack Overflow.)
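To make the difference between the two modes concrete, here is a small sketch on toy numbers (the data and `alpha` are arbitrary, chosen only for illustration). It compares `adjust=True`, `adjust=False`, and a hand-written version of the recursion.

```python
import pandas as pd

x = pd.Series([10.0, 20.0, 30.0, 40.0])
alpha = 0.5

# Weighted average over all past values, renormalized at every step.
adjusted = x.ewm(alpha=alpha, adjust=True).mean()
# Plain recursion, started at y_0 = x_0.
recursive = x.ewm(alpha=alpha, adjust=False).mean()

# Manual recursion: y_0 = x_0, y_t = (1 - alpha) * y_{t-1} + alpha * x_t
manual = [x.iloc[0]]
for value in x.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * value)

print(adjusted.tolist())
print(recursive.tolist())
print(manual)
```

The `adjust=False` result matches the manual recursion, and in both the first output is simply the first input, which is the part that cannot be replaced by an average directly.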

To work around this... you should hack input values a bit to get desired result:

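    # Put the sum of the first 'length' values at index length-1 and zero the
    # earlier ones, so the recursive ewm yields alpha * sum (their average) there.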
    pos.iloc[length-1] = pos[:length].sum()
    pos[:length-1] = 0
    neg.iloc[length-1] = neg[:length].sum()
    neg[:length-1] = 0
    ...
    dmp = k * pos.ewm(alpha=alpha, adjust=False, min_periods=length).mean()
    dmn = k * neg.ewm(alpha=alpha, adjust=False, min_periods=length).mean()

adx is a bit more tricky. It requires `length` values of both dmp and dmn (at least for TradingView). If your data starts on 06-19, the first value of adx will appear on 07-16, which is 27 days later, assuming the length is set to 14.
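The 27-day delay can be checked with a little arithmetic. The sketch below assumes a TradingView-like warm-up (directional movement needs one prior bar, and each RMA first appears once it has `length` values) and that the signal length equals `length`; the variable names are illustrative only.

```python
from datetime import date, timedelta

length = 14
first_bar = date(2020, 6, 19)

# Bar 0 has no previous bar, so directional movement starts at bar 1.
first_dm = 1
# An RMA seeded with a `length`-bar average first appears `length - 1` bars later.
first_di = first_dm + (length - 1)
# ADX smooths DX (available from the first DI bar) with a second `length`-bar RMA.
first_adx = first_di + (length - 1)

print(first_adx, first_bar + timedelta(days=first_adx))
```

With daily bars and no gaps this gives bar 27, i.e. 2020-07-16 for data starting on 2020-06-19.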

    ...
    # Set the initial value to sum() of 'length'.
    dx = dx.shift(-length)
    dx.iloc[length-1] = dx[:length].sum()
    dx[:length-1] = 0

    adx = dx.ewm(alpha=alpha, adjust=False, min_periods=length).mean()
    adx = adx.shift(length) # rollback

I'm not sure this is considered to be a bug or something, but I hope my explanation helps.

hakumaku commented 2 years ago

source code

from datetime import datetime

import pandas as pd
import pandas_ta as ta

def main():
    df = pd.DataFrame(
        [
            (datetime(year=2020, month=6, day=19), 11382, 11150, 11286),
            (datetime(year=2020, month=6, day=20), 11351, 11155, 11325),
            (datetime(year=2020, month=6, day=21), 11355, 11225, 11238),
            (datetime(year=2020, month=6, day=22), 11675, 11232, 11593),
            (datetime(year=2020, month=6, day=23), 11649, 11488, 11509),
            (datetime(year=2020, month=6, day=24), 11553, 11110, 11233),
            (datetime(year=2020, month=6, day=25), 11232, 10842, 11147),
            (datetime(year=2020, month=6, day=26), 11185, 10901, 11045),
            (datetime(year=2020, month=6, day=27), 11051, 10791, 10913),
            (datetime(year=2020, month=6, day=28), 11163, 10756, 10970),
            (datetime(year=2020, month=6, day=29), 11033, 10814, 10972),
            (datetime(year=2020, month=6, day=30), 11004, 10882, 10903),
            (datetime(year=2020, month=7, day=1), 11111, 10887, 11051),
            (datetime(year=2020, month=7, day=2), 11077, 10772, 10888),
            (datetime(year=2020, month=7, day=3), 10920, 10821, 10837),
            (datetime(year=2020, month=7, day=4), 10960, 10814, 10884),
            (datetime(year=2020, month=7, day=5), 10900, 10700, 10831),
            (datetime(year=2020, month=7, day=6), 11102, 10796, 11065),
            (datetime(year=2020, month=7, day=7), 11097, 10952, 10988),
            (datetime(year=2020, month=7, day=8), 11208, 10968, 11138),
            (datetime(year=2020, month=7, day=9), 11149, 10906, 10961),
            (datetime(year=2020, month=7, day=10), 11062, 10880, 11043),
            (datetime(year=2020, month=7, day=11), 11044, 10910, 10957),
            (datetime(year=2020, month=7, day=12), 11061, 10915, 11036),
            (datetime(year=2020, month=7, day=13), 11063, 10908, 11000),
            (datetime(year=2020, month=7, day=14), 11040, 10925, 11010),
            (datetime(year=2020, month=7, day=15), 11017, 10955, 10975),
            (datetime(year=2020, month=7, day=16), 11003, 10835, 10934),
            (datetime(year=2020, month=7, day=17), 10980, 10875, 10937),
            (datetime(year=2020, month=7, day=18), 10990, 10894, 10905),
        ],
        columns=["date", "high", "low", "close"],
    )
    result = pd.concat([df["date"], df.ta.adx()], axis=1)
    print(result)

if __name__ == "__main__":
    main()

distributed version

$ pip freeze
numpy==1.23.1
pandas==1.4.3
pandas-ta==0.3.14b0
python-dateutil==2.8.2
pytz==2022.1
six==1.16.0

$ python main.py
         date    ADX_14     DMP_14     DMN_14
0  2020-06-19       NaN        NaN        NaN
1  2020-06-20       NaN        NaN        NaN
2  2020-06-21       NaN        NaN        NaN
3  2020-06-22       NaN        NaN        NaN
4  2020-06-23       NaN        NaN        NaN
5  2020-06-24       NaN        NaN        NaN
6  2020-06-25       NaN        NaN        NaN
7  2020-06-26       NaN        NaN        NaN
8  2020-06-27       NaN        NaN        NaN
9  2020-06-28       NaN        NaN        NaN
10 2020-06-29       NaN        NaN        NaN
11 2020-06-30       NaN        NaN        NaN
12 2020-07-01       NaN        NaN        NaN
13 2020-07-02       NaN        NaN        NaN
14 2020-07-03       NaN  13.619275  22.613658
15 2020-07-04       NaN  14.501944  21.164943
16 2020-07-05       NaN  13.249724  24.259245
17 2020-07-06       NaN  19.821640  21.237645
18 2020-07-07       NaN  18.637050  19.968431
19 2020-07-08       NaN  21.295372  18.046053
20 2020-07-09       NaN  19.272314  18.755545
21 2020-07-10       NaN  17.900668  18.437419
22 2020-07-11       NaN  16.944456  17.452536
23 2020-07-12       NaN  16.631818  16.423159
24 2020-07-13       NaN  15.581120  15.670945
25 2020-07-14       NaN  14.832431  14.917940
26 2020-07-15       NaN  14.429859  14.513047
27 2020-07-16  6.493720  13.370867  18.690018
28 2020-07-17  7.568603  12.741483  17.810255
29 2020-07-18  8.310010  12.638508  17.021365

dev version

$ pip freeze
numpy==1.23.1
pandas==1.4.3
pandas-ta @ git+https://github.com/twopirllc/pandas-ta.git@6596651f33a6045b2f0e6c881288bd3aff01c175
python-dateutil==2.8.2
pytz==2022.1
six==1.16.0

$ python main.py
         date     ADX_14     DMP_14     DMN_14
0  2020-06-19        NaN        NaN        NaN
1  2020-06-20        NaN        NaN        NaN
2  2020-06-21        NaN        NaN        NaN
3  2020-06-22        NaN        NaN        NaN
4  2020-06-23        NaN        NaN        NaN
5  2020-06-24        NaN        NaN        NaN
6  2020-06-25        NaN        NaN        NaN
7  2020-06-26        NaN        NaN        NaN
8  2020-06-27        NaN        NaN        NaN
9  2020-06-28        NaN        NaN        NaN
10 2020-06-29        NaN        NaN        NaN
11 2020-06-30        NaN        NaN        NaN
12 2020-07-01        NaN        NaN        NaN
13 2020-07-02  24.823779   8.726429  14.489500
14 2020-07-03  24.823779   8.491924  14.100124
15 2020-07-04  24.385023   9.265765  13.522972
16 2020-07-05  24.739785   8.738114  15.998827
17 2020-07-06  23.218992  13.651224  14.626430
18 2020-07-07  21.806827  13.078707  14.013015
19 2020-07-08  20.839146  15.385969  13.038326
20 2020-07-09  19.447701  14.301317  13.917841
21 2020-07-10  18.164087  13.531880  13.937633
22 2020-07-11  16.972160  12.978236  13.367388
23 2020-07-12  15.804952  12.917109  12.755055
24 2020-07-13  14.696557  12.274273  12.345034
25 2020-07-14  13.667333  11.804868  11.872923
26 2020-07-15  12.711625  11.548447  11.615024
27 2020-07-16  12.988707  10.860069  15.180383
28 2020-07-17  13.245998  10.441162  14.594829
29 2020-07-18  13.355360  10.440266  14.060804

fixed method

def adx(
    high: Series,
    low: Series,
    close: Series,
    length: Int = None,
    lensig: Int = None,
    scalar: IntFloat = None,
    mamode: str = None,
    drift: Int = None,
    offset: Int = None,
    **kwargs: DictLike,
) -> DataFrame:
    """Average Directional Movement (ADX)

    Average Directional Movement is meant to quantify trend strength by
    measuring the amount of movement in a single direction.

    Sources:
        TA Lib Correlation: >99%
        https://www.tradingtechnologies.com/help/x-study/technical-indicator-definitions/average-directional-movement-adx/

    Args:
        high (pd.Series): Series of 'high's
        low (pd.Series): Series of 'low's
        close (pd.Series): Series of 'close's
        length (int): Its period. Default: 14
        lensig (int): Signal Length. Like TradingView's default ADX.
            Default: length
        scalar (float): How much to magnify. Default: 100
        mamode (str): See ``help(ta.ma)``. Default: 'rma'
        drift (int): The difference period. Default: 1
        offset (int): How many periods to offset the result. Default: 0

    Kwargs:
        fillna (value, optional): pd.DataFrame.fillna(value)
        fill_method (value, optional): Type of fill method

    Returns:
        pd.DataFrame: adx, dmp, dmn columns.
    """
    # Validate
    length = v_pos_default(length, 14)
    lensig = v_pos_default(lensig, length)
    _length = max(length, lensig)
    high = v_series(high, _length)
    low = v_series(low, _length)
    close = v_series(close, _length)

    if high is None or low is None or close is None:
        return

    scalar = v_scalar(scalar, 100)
    mamode = v_mamode(mamode, "rma")
    drift = v_drift(drift)
    offset = v_offset(offset)

    # Calculate
    atr_ = atr(
        high=high,
        low=low,
        close=close,
        length=length,
        prenan=kwargs.pop("prenan", True),
    )
    if atr_ is None or all(isnan(atr_)):
        return

    up = high - high.shift(drift)  # high.diff(drift)
    dn = low.shift(drift) - low  # low.diff(-drift).shift(drift)

    pos = ((up > dn) & (up > 0)) * up
    neg = ((dn > up) & (dn > 0)) * dn

    pos = pos.apply(zero)
    neg = neg.apply(zero)

    pos.iloc[length - 1] = pos[:length].sum()
    pos[: length - 1] = 0
    neg.iloc[length - 1] = neg[:length].sum()
    neg[: length - 1] = 0

    k = scalar / atr_
    alpha = 1 / length
    dmp = k * pos.ewm(alpha=alpha, adjust=False, min_periods=length).mean()
    dmn = k * neg.ewm(alpha=alpha, adjust=False, min_periods=length).mean()

    dx = scalar * (dmp - dmn).abs() / (dmp + dmn)
    dx = dx.shift(-length)
    dx.iloc[length - 1] = dx[:length].sum()
    dx[: length - 1] = 0

    adx = ma(mamode, dx, length=lensig)
    adx[: length - 1] = np.nan
    adx = adx.shift(length)

    # Offset
    if offset != 0:
        dmp = dmp.shift(offset)
        dmn = dmn.shift(offset)
        adx = adx.shift(offset)

    # Fill
    if "fillna" in kwargs:
        adx.fillna(kwargs["fillna"], inplace=True)
        dmp.fillna(kwargs["fillna"], inplace=True)
        dmn.fillna(kwargs["fillna"], inplace=True)
    if "fill_method" in kwargs:
        adx.fillna(method=kwargs["fill_method"], inplace=True)
        dmp.fillna(method=kwargs["fill_method"], inplace=True)
        dmn.fillna(method=kwargs["fill_method"], inplace=True)

    # Name and Category
    adx.name = f"ADX_{lensig}"
    dmp.name = f"DMP_{length}"
    dmn.name = f"DMN_{length}"
    adx.category = dmp.category = dmn.category = "trend"

    data = {adx.name: adx, dmp.name: dmp, dmn.name: dmn}
    adxdf = DataFrame(data)
    adxdf.name = f"ADX_{lensig}"
    adxdf.category = "trend"

    return adxdf

$ python modified.py
         date     ADX_14     DMP_14     DMN_14
0  2020-06-19        NaN        NaN        NaN
1  2020-06-20        NaN        NaN        NaN
2  2020-06-21        NaN        NaN        NaN
3  2020-06-22        NaN        NaN        NaN
4  2020-06-23        NaN        NaN        NaN
5  2020-06-24        NaN        NaN        NaN
6  2020-06-25        NaN        NaN        NaN
7  2020-06-26        NaN        NaN        NaN
8  2020-06-27        NaN        NaN        NaN
9  2020-06-28        NaN        NaN        NaN
10 2020-06-29        NaN        NaN        NaN
11 2020-06-30        NaN        NaN        NaN
12 2020-07-01        NaN        NaN        NaN
13 2020-07-02        NaN  14.064555  22.560271
14 2020-07-03        NaN  13.686598  21.954010
15 2020-07-04        NaN  14.247809  21.055379
16 2020-07-05        NaN  13.436449  23.102292
17 2020-07-06        NaN  17.946530  21.120552
18 2020-07-07        NaN  17.193874  20.234781
19 2020-07-08        NaN  19.214901  18.827331
20 2020-07-09        NaN  17.860325  19.298744
21 2020-07-10        NaN  16.899406  19.029033
22 2020-07-11        NaN  16.207983  18.250478
23 2020-07-12        NaN  15.998908  17.414460
24 2020-07-13        NaN  15.202702  16.772558
25 2020-07-14        NaN  14.621306  16.131126
26 2020-07-15        NaN  14.303707  15.780731
27 2020-07-16   9.874338  13.451093  19.097781
28 2020-07-17  10.408195  12.932243  18.361121
29 2020-07-18  10.799274  12.840198  17.689287

twopirllc commented 2 years ago

Hi @hakumaku,

I have a lot of 🔥s, so I appreciate the explanation and code to help fix this issue. 😎

Since your code meets ⬇️, please make a PR. 😎

This is not a TradingView library, though I try to accommodate their implementation if it does not exist in TA Lib.

Kind Regards, KJ

anon2010 commented 1 year ago

How can I implement this fix? Swap out Hakumaku's def adx method for the one in adx.py ?

char101 commented 10 months ago

According to https://school.stockcharts.com/doku.php?id=technical_indicators:average_directional_index_adx (which is pretty much what is written in Welles Wilder's book), ADX is not smoothed using EMA; instead it is calculated as

First ADX14 = 14 period Average of DX 
Second ADX14 = ((First ADX14 x 13) + Current DX Value)/14 
Subsequent ADX14 = ((Prior ADX14 x 13) + Current DX Value)/14

Rather than EMA, this is the Volatility Index equation.
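For reference, the three steps quoted above can be sketched as follows. The helper name `wilder_adx_from_dx` is hypothetical, and it assumes a NaN-free list of DX values is already available. Note that `(prior * (n - 1) + dx) / n` is algebraically the same as `(1 - 1/n) * prior + (1/n) * dx`, so the recursion is an RMA with `alpha = 1 / n`; what differs from a plain `ewm` is the averaged first value.

```python
import numpy as np


def wilder_adx_from_dx(dx, length=14):
    """Smooth DX values following the quoted recipe (hypothetical helper)."""
    dx = np.asarray(dx, dtype=float)
    adx = np.full(len(dx), np.nan)
    if len(dx) < length:
        return adx

    # First ADX: simple average of the first `length` DX values.
    adx[length - 1] = dx[:length].mean()
    # Subsequent ADX: ((prior ADX * (length - 1)) + current DX) / length
    for i in range(length, len(dx)):
        adx[i] = (adx[i - 1] * (length - 1) + dx[i]) / length
    return adx


result = wilder_adx_from_dx([10.0, 20.0, 30.0, 40.0, 50.0], length=3)
print(result)
```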