Open akaniklaus opened 6 years ago
hey, normalizing candle data is heavily debated online. simple min-max normalization (with min and max taken over a fixed look-back window) is not recommended. some normalization methods and their impact on forecasting with NNs are described here:
you will find much more info on it online and will have to choose a method
for oscillators with a defined minmax range you can use a mapping function like (C++):
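```cpp
#include <cfloat>  // DBL_EPSILON
#include <cmath>   // std::abs

// Linearly maps value from [inputMin, inputMax] to [outputMin, outputMax].
// Intended for floating-point T; an integer T would truncate in the division.
template <typename T>
static T map(const T &value,
             const T &inputMin, const T &inputMax,
             const T &outputMin, const T &outputMax,
             bool clamp = false) {
    // Degenerate input range: avoid dividing by (near) zero.
    if (std::abs(inputMin - inputMax) < DBL_EPSILON) {
        return outputMin;
    }
    T outVal = (value - inputMin) / (inputMax - inputMin) * (outputMax - outputMin) + outputMin;
    if (clamp) {
        if (outputMax < outputMin) {
            // Reversed output range: outputMax is the lower bound.
            if (outVal < outputMax) outVal = outputMax;
            else if (outVal > outputMin) outVal = outputMin;
        } else {
            if (outVal > outputMax) outVal = outputMax;
            else if (outVal < outputMin) outVal = outputMin;
        }
    }
    return outVal;
}
```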
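For anyone working in Python instead, the same linear mapping could be sketched roughly as follows. The name `map_range` and the `1e-12` tolerance are illustrative choices, not part of TA-Lib, and the example assumes the usual bounds of [0, 100] for RSI and [-100, 0] for Williams %R.

```python
def map_range(value, in_min, in_max, out_min, out_max, clamp=False):
    """Linearly map value from [in_min, in_max] to [out_min, out_max]."""
    if abs(in_max - in_min) < 1e-12:
        # Degenerate input range: fall back to the lower output bound.
        return out_min
    out = (value - in_min) / (in_max - in_min) * (out_max - out_min) + out_min
    if clamp:
        # Works for normal and reversed output ranges alike.
        lo, hi = min(out_min, out_max), max(out_min, out_max)
        out = max(lo, min(hi, out))
    return out


# Hypothetical indicator readings mapped to [0, 1].
rsi_scaled = map_range(70.0, 0.0, 100.0, 0.0, 1.0)
willr_scaled = map_range(-25.0, -100.0, 0.0, 0.0, 1.0)
```

With `clamp=True`, a reading that overshoots the assumed input range is pinned to the nearest output bound instead of leaking outside it.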
@johanjohan Thanks a lot for these useful resources. I was just trying to scale the ones that have specific intervals to [0.0, 1.0]. For the others, I believe it is important to normalize by the Close price, because otherwise the values wouldn't be comparable, especially with data from multiple assets. I wrote the following script to extract all such multi-period features available from TA-Lib:
```python
from time import time

import numpy as np
import pandas as pd
import talib as ta
from talib import abstract

# `polo` is assumed to be a Poloniex API client created elsewhere;
# `normalizeClose`, `normalize360` and `normalize100` are the sets
# listed in the original post.

func_groups = ta.get_function_groups()
selected_funcs = list(func_groups['Momentum Indicators'])
selected_funcs += func_groups['Volatility Indicators']
selected_funcs += func_groups['Cycle Indicators']
selected_funcs += func_groups['Overlap Studies']
#selected_funcs += func_groups['Volume Indicators']
selected_funcs += func_groups['Statistic Functions']
selected_funcs = set(selected_funcs) - {'MAVP'}
# Sort so the feature order is the same on every run.
selected_funcs = sorted(selected_funcs | {'ADOSC'})


def extract_tafeatures(market, d_period):
    # Fetch candles for the last 31 days (2678400 seconds).
    input_arrays = pd.DataFrame(polo.returnChartData(market, d_period, round(time()) - 2678400))
    row = []
    for func_name in selected_funcs:
        print(func_name)
        abs_func = abstract.Function(func_name)
        # Use a scalar close so multi-output results divide element-wise
        # instead of being aligned on a mismatched index.
        denom = input_arrays['close'].iloc[-1] if func_name in normalizeClose else 1.0
        result = abs_func(input_arrays).iloc[-1] / denom
        if func_name in normalize360:
            result /= 360.0
        if func_name in normalize100:
            result /= 100.0
        print(result)
        if isinstance(result, pd.Series):
            # Multi-output functions (e.g. BBANDS) yield one value per output.
            row.extend(result.values)
        else:
            row.append(result)
    return pd.Series(row)


ff = extract_tafeatures('BTC_ETH', 900).dropna()
ff = np.append(ff, extract_tafeatures('BTC_ETH', 1800).dropna())
ff = np.append(ff, extract_tafeatures('BTC_ETH', 7200).dropna())
ff = np.append(ff, extract_tafeatures('BTC_ETH', 14400).dropna())
```
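To illustrate why dividing by the Close price matters with multiple assets, here is a small standalone sketch. It uses a hand-rolled simple moving average in place of TA-Lib, and both price series are made-up numbers; `sma` and `ratio_to_close` are hypothetical helper names.

```python
def sma(values, period):
    """Simple moving average of the last `period` values."""
    window = values[-period:]
    return sum(window) / len(window)


def ratio_to_close(indicator_value, close):
    """Express a price-scale indicator as a ratio of the latest close."""
    return indicator_value / close


# Two made-up assets with the same shape but a 1000x price difference.
cheap = [0.010, 0.011, 0.012, 0.011, 0.013]
expensive = [c * 1000.0 for c in cheap]

raw_cheap = sma(cheap, 3)
raw_expensive = sma(expensive, 3)

cheap_feature = ratio_to_close(raw_cheap, cheap[-1])
expensive_feature = ratio_to_close(raw_expensive, expensive[-1])
```

The raw averages differ by a factor of 1000, while the two ratios come out essentially equal, so a model sees the same feature for the same price pattern regardless of the asset's price level.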
@akaniklaus thx, that is a helpful listing. my first link refers to normalizing close prices, where min and max are unknown. normalizing close prices is a necessity in AI/genetic algos, so the best advice may be found right there. i am currently testing the tanh estimator described in the pdf for normalizing close prices. simple min-max normalization of close prices is not recommended.
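For reference, a rough sketch of a tanh estimator is below. It assumes the commonly cited form 0.5 * (tanh(0.01 * (x - mean) / std) + 1), which may differ in detail from the one in the linked pdf, and it substitutes the plain mean and standard deviation for the Hampel estimators used in the original formulation. The function name and the sample closes are made up.

```python
import math
import statistics


def tanh_estimator(values, scale=0.01):
    """Squash values into (0, 1) around their mean with a tanh curve."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0.0:
        # Constant series: every value sits exactly at the midpoint.
        return [0.5 for _ in values]
    return [0.5 * (math.tanh(scale * (v - mu) / sigma) + 1.0) for v in values]


# Made-up close prices.
closes = [100.0, 102.0, 101.0, 105.0, 110.0]
scaled = tanh_estimator(closes)
```

With the default `scale` of 0.01 the outputs stay very close to 0.5, so a larger `scale` may suit a given model better. For forecasting, `mu` and `sigma` would normally be estimated on past data only, to avoid look-ahead.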
I don't have time right now to help, but I would appreciate any useful categorization being contributed back to the talib API!
Dear Sir/Madam,
I am having difficulty normalizing the outputs of TA-Lib, as there is no clear documentation of the range of every function in the library. I would welcome a full list of function ranges if you can share one. I am trying to classify the functions into those that need to be normalized by the Close price and those that need to be divided by 100, and then to divide these groups further into those with a [0.0, 1.0] range and those with a [-1.0, 1.0] range. I would be glad if you could help with this classification. I have classified some of them below, but there might be mistakes, as I made the assignments by assumption.
normalizeClose = {'EMA', 'DEMA', 'MIDPOINT', 'MIDPRICE', 'SAREXT', 'LINEARREG_INTERCEPT', 'SMA', 'BBANDS', 'TRIMA', 'TEMA', 'KAMA', 'PLUS_DM', 'MINUS_DM', 'T3', 'SAR', 'VAR', 'MA', 'WMA', 'LINEARREG', 'MAMA', 'TSF', 'HT_TRENDLINE', 'STDDEV'}
normalize360 = {'HT_DCPHASE', 'HT_PHASOR', 'HT_DCPERIOD'}
normalize100 = {'CMO', 'STOCHF', 'MINUS_DI', 'CCI', 'DX', 'TRANGE', 'ROCR100', 'MFI', 'PLUS_DI', 'AROON', 'LINEARREG_ANGLE', 'WILLR', 'ULTOSC', 'MOM', 'ADX', 'LINEARREG_SLOPE', 'MACD', 'MACDEXT', 'STOCH', 'MACDFIX', 'AROONOSC', 'RSI', 'ADXR', 'APO', 'ATR', 'STOCHRSI', 'ADOSC'}
Sincerely, Kamer
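As a starting point for the classification asked about above, here is a hedged sketch of a range table for indicators that are bounded by their standard definitions. The bounds are assumptions taken from those definitions, not from an official TA-Lib listing, and should be checked against real output. Indicators such as CCI (unbounded) or MACD, MOM, APO, ATR and TRANGE (expressed in price units) are left out on purpose, since dividing them by 100 does not bound them. `KNOWN_RANGES` and `scale_bounded` are hypothetical names.

```python
# Assumed output bounds from the indicators' standard definitions.
# Not an official TA-Lib listing; verify against your own data.
KNOWN_RANGES = {
    "RSI": (0.0, 100.0),
    "MFI": (0.0, 100.0),
    "ULTOSC": (0.0, 100.0),
    "STOCH": (0.0, 100.0),
    "STOCHF": (0.0, 100.0),
    "STOCHRSI": (0.0, 100.0),
    "ADX": (0.0, 100.0),
    "ADXR": (0.0, 100.0),
    "DX": (0.0, 100.0),
    "PLUS_DI": (0.0, 100.0),
    "MINUS_DI": (0.0, 100.0),
    "AROON": (0.0, 100.0),
    "WILLR": (-100.0, 0.0),
    "AROONOSC": (-100.0, 100.0),
    "CMO": (-100.0, 100.0),
    "LINEARREG_ANGLE": (-90.0, 90.0),
}


def scale_bounded(func_name, value):
    """Scale a bounded indicator to [-1, 1] if symmetric, else to [0, 1]."""
    lo, hi = KNOWN_RANGES[func_name]
    if lo == -hi:
        # Symmetric around zero: keep the sign.
        return value / hi
    return (value - lo) / (hi - lo)


# Hypothetical readings.
rsi = scale_bounded("RSI", 70.0)
willr = scale_bounded("WILLR", -25.0)
cmo = scale_bounded("CMO", -50.0)
```

Keeping the bounds in a single table makes it easy to correct an entry once the actual output range has been confirmed.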