TA-Lib / ta-lib-python

Python wrapper for TA-Lib (http://ta-lib.org/).
http://ta-lib.github.io/ta-lib-python

Normalizing Outputs #183

Open · akaniklaus opened this issue 6 years ago

akaniklaus commented 6 years ago

Dear Sir/Madam,

I am having difficulties normalizing the outputs of TA-Lib, as there is no clear documentation about the range of each function in the library. I would appreciate it if you could share a full list of function ranges. I am trying to classify them into ones that need to be normalized by the Close price and ones that need to be normalized by 100, and then divide these groups further into ones whose range is [0.0, 1.0] and ones whose range is [-1.0, 1.0]. I would be glad if you could help with this classification. I have done some of them below, but there might be mistakes, since I made them by assumption.

normalizeClose = {'EMA', 'DEMA', 'MIDPOINT', 'MIDPRICE', 'SAREXT', 'LINEARREG_INTERCEPT', 'SMA', 'BBANDS', 'TRIMA', 'TEMA', 'KAMA', 'PLUS_DM', 'MINUS_DM', 'T3', 'SAR', 'VAR', 'MA', 'WMA', 'LINEARREG', 'MAMA', 'TSF', 'HT_TRENDLINE', 'STDDEV'}

normalize360 = {'HT_DCPHASE', 'HT_PHASOR', 'HT_DCPERIOD'}

normalize100 = {'CMO', 'STOCHF', 'MINUS_DI', 'CCI', 'DX', 'TRANGE', 'ROCR100', 'MFI', 'PLUS_DI', 'AROON', 'LINEARREG_ANGLE', 'WILLR', 'ULTOSC', 'MOM', 'ADX', 'LINEARREG_SLOPE', 'MACD', 'MACDEXT', 'STOCH', 'MACDFIX', 'AROONOSC', 'RSI', 'ADXR', 'APO', 'ATR', 'STOCHRSI', 'ADOSC'}
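
As a rough empirical check (not from the official docs), one could run every function on synthetic OHLCV data and inspect the observed output ranges; a minimal sketch, with random inputs used purely for illustration:

```python
import numpy as np
from talib import abstract, get_functions

# synthetic OHLCV data, only for probing output ranges
n = 500
close = 100.0 + np.cumsum(np.random.randn(n))
inputs = {
    'open':   close + np.random.rand(n),
    'high':   close + 2.0 * np.random.rand(n),
    'low':    close - 2.0 * np.random.rand(n),
    'close':  close,
    'volume': np.random.rand(n) * 1e6,
}

for name in get_functions():
    try:
        outputs = abstract.Function(name)(inputs)
    except Exception:
        continue  # e.g. MAVP needs an extra 'periods' input
    outputs = outputs if isinstance(outputs, list) else [outputs]
    lo = min(float(np.nanmin(o)) for o in outputs)
    hi = max(float(np.nanmax(o)) for o in outputs)
    print('%-16s observed range: [%10.2f, %10.2f]' % (name, lo, hi))
```

This only shows the ranges reached on one sample of data, not the theoretical bounds, but it helps catch obvious misclassifications.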

Sincerely, Kamer

johanjohan commented 6 years ago

Hey, normalizing candle data is heavily debated online. Simple min/max normalization (with the min/max looking back over a certain length of data) is not recommended. Some methods of normalizing, and their impact on forecasting via NNs, are described here:

You will find much more info on it online and will have to choose a method yourself.
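
For reference, a minimal sketch of the simple rolling min/max normalization being discussed (pandas assumed; `length` is an arbitrary lookback, and this is the approach advised against above for close prices):

```python
import pandas as pd

def rolling_minmax(series, length=100):
    # scale each value by the min/max over the previous `length` bars
    lo = series.rolling(length).min()
    hi = series.rolling(length).max()
    return (series - lo) / (hi - lo)  # NaN for the warm-up period and flat windows
```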

johanjohan commented 6 years ago

For oscillators with a defined min/max range you can use a mapping function like this (C++):

```cpp
#include <cmath>    // std::abs
#include <cfloat>   // DBL_EPSILON

// Linearly map `value` from [inputMin, inputMax] to [outputMin, outputMax],
// optionally clamping the result to the output range.
template <typename T>
static T map(const T &value,
             const T &inputMin, const T &inputMax,
             const T &outputMin, const T &outputMax,
             bool clamp = false) {
    if (std::abs(inputMin - inputMax) < DBL_EPSILON) {
        return outputMin;  // degenerate input range
    }
    T outVal = (value - inputMin) / (inputMax - inputMin) * (outputMax - outputMin) + outputMin;

    if (clamp) {
        if (outputMax < outputMin) {  // reversed output range
            if (outVal < outputMax) outVal = outputMax;
            else if (outVal > outputMin) outVal = outputMin;
        } else {
            if (outVal > outputMax) outVal = outputMax;
            else if (outVal < outputMin) outVal = outputMin;
        }
    }
    return outVal;
}
```
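
A rough Python equivalent of the same mapping (not from TA-Lib; `map_range` is just an illustrative name, and `clamp` behaves like the C++ version above):

```python
import numpy as np

def map_range(value, input_min, input_max, output_min, output_max, clamp=False):
    # linearly map `value` from [input_min, input_max] to [output_min, output_max]
    if np.isclose(input_min, input_max):
        return np.full_like(np.asarray(value, dtype=float), output_min)
    out = (np.asarray(value, dtype=float) - input_min) / (input_max - input_min)
    out = out * (output_max - output_min) + output_min
    if clamp:
        lo, hi = sorted((output_min, output_max))  # handles a reversed output range
        out = np.clip(out, lo, hi)
    return out

# e.g. scale RSI (defined on [0, 100]) to [0.0, 1.0]:
# rsi01 = map_range(talib.RSI(close), 0.0, 100.0, 0.0, 1.0)
```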

akaniklaus commented 6 years ago

@johanjohan Thanks a lot for these useful resources. I was just trying to scale the ones that have specific intervals to [0.0, 1.0]. For the others, I believe it is important to normalize by the Close price, because otherwise they wouldn't make sense, especially when combining data from multiple assets. I wrote the following script to extract all such multi-period features available from TA-Lib:

```python
import numpy as np
import pandas as pd
import talib as ta
from talib import abstract
from time import time

# `polo` is assumed to be a Poloniex API client providing returnChartData();
# it is not part of TA-Lib. normalizeClose / normalize360 / normalize100 are
# the sets from my first comment above.

func_groups = ta.get_function_groups()
selected_funcs = func_groups['Momentum Indicators']
selected_funcs += func_groups['Volatility Indicators']
selected_funcs += func_groups['Cycle Indicators']
selected_funcs += func_groups['Overlap Studies']
#selected_funcs += func_groups['Volume Indicators']
selected_funcs += func_groups['Statistic Functions']
selected_funcs = set(selected_funcs) - set(['MAVP'])   # MAVP needs a periods array
selected_funcs = set(selected_funcs) | set(['ADOSC'])

def extract_tafeatures(market, d_period):
    # last ~31 days of candles at the requested period (in seconds)
    input_arrays = pd.DataFrame(polo.returnChartData(market, d_period, round(time()) - 2678400))
    row = []
    for func_name in selected_funcs:
        print(func_name)
        abs_func = abstract.Function(func_name)
        # divide price-like outputs by the latest close (scalar, to avoid index alignment issues)
        denom = input_arrays['close'].iloc[-1] if func_name in normalizeClose else 1.0
        result = abs_func(input_arrays).iloc[-1] / denom
        if func_name in normalize360:
            result /= 360.0
        if func_name in normalize100:
            result /= 100.0
        print(result)
        if isinstance(result, pd.Series):
            row.extend(result.values)
        else:
            row.append(result)
    return pd.Series(row)

ff = extract_tafeatures('BTC_ETH', 900).dropna()
ff = np.append(ff, extract_tafeatures('BTC_ETH', 1800).dropna())
ff = np.append(ff, extract_tafeatures('BTC_ETH', 7200).dropna())
ff = np.append(ff, extract_tafeatures('BTC_ETH', 14400).dropna())
```

johanjohan commented 6 years ago

@akaniklaus Thanks, that is a helpful listing. My first link refers to normalizing close prices, where the min and max are unknown. Normalizing close prices is a necessity for AI/genetic algorithms, so the best advice may be found right there. I am currently testing the tanh estimator described in the PDF for normalizing close prices. Simple min/max normalization of close prices is not recommended.
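
For reference, a minimal sketch of one common formulation of that tanh estimator (using the plain sample mean and standard deviation; the original formulation uses robust Hampel estimates for location/scale, so treat this as an approximation, and `tanh_normalize` is just an illustrative name):

```python
import numpy as np

def tanh_normalize(x):
    # tanh estimator: squashes the standardized series into (0, 1)
    x = np.asarray(x, dtype=float)
    mu, sigma = np.nanmean(x), np.nanstd(x)
    return 0.5 * (np.tanh(0.01 * (x - mu) / sigma) + 1.0)
```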

mrjbq7 commented 6 years ago

I don't have time right now to help but I would appreciate any useful categorization being contributed back to the talib API!