indicators show different values on windows of different sizes

mytarmail commented 8 months ago

Description

Hello! Unless I have made a mistake in my calculations, I found that the indicators return different values depending on the size of the window they are computed on. This is a problem because it means the results of previously trained models cannot be trusted.

Expected behavior

It would be cool to sort this out since the results are misleading :)

Minimal, reproducible example

library(xts)
library(quantmod)
library(TTR)
n <- 100000
set.seed(1)
#######################################
#########  make some prices ###########
#######################################
volume <- sample(1:10,n,replace = TRUE,prob = c(10:1))
prices <- round(cumsum(rnorm(n,sd = 0.01)),2)+1000

p <-  cbind(price = prices, volume = volume) |> 
      xts(as.POSIXct("2000-01-01 00:00:00 EET")-n:1) |> 
      to.minutes(name = "my") 
colnames(p) <- gsub("^my\\.", "", colnames(p))
# head(p)
# chart_Series(tail(p,200))

############################################################
#########  function for calculate all indicators ###########
############################################################

get_all_indicators <- function(p, n= 10){
  CLOSE <- quantmod::Cl(p)
  OHLC <-  quantmod::OHLC(p)
  HLC <-   quantmod::HLC(p)
  HL <-    quantmod::HL(p)
  VOLUME <- quantmod::Vo(p)

  f <- function(x) unname(coredata(x))

  data.frame(

    chaikinVolatility = TTR::chaikinVolatility(HL = HL, n = n) |> f(),
    #DPO =   TTR::DPO(x = CLOSE, n = n) |> f(),
    EMA =   TTR::EMA(x = CLOSE, n = n) |> f(),
    momentum = TTR::momentum(CLOSE, n = n) |> f(),
    SNR =   TTR::SNR(HLC = HLC, n = n) |> f(),
    WPR = TTR::WPR(HLC = HLC, n = n) |> f(),

    ADX =    TTR::ADX(HLC = HLC, n = n),
    ALMA =   TTR::ALMA(OHLC, n = n),
    aroon =  TTR::aroon(HL = HL, n = n),
    ATR =    TTR::ATR(HLC = HLC, n = n),
    BBands = TTR::BBands(HLC = HLC, n = n),
    CCI =    TTR::CCI(HLC = HLC, n = n),
    chaikinAD = TTR::chaikinAD(HLC = HLC, volume = VOLUME),
    CLV =   TTR::CLV(HLC = HLC),
    CMF =   TTR::CMF(HLC = HLC, n = n, volume = VOLUME),
    CMO =   TTR::CMO(x = CLOSE, n = n),
    CTI =   TTR::CTI(price = CLOSE, n = n),
    DEMA =  TTR::DEMA(x = CLOSE,n = n),
    DonchianChannel = TTR::DonchianChannel(HL = HL,n = n),
    DVI =   TTR::DVI(price = CLOSE, n = n),
    EVWMA = TTR::EVWMA(price = CLOSE, n = n, volume = VOLUME),
    HMA =   TTR::HMA(CLOSE, n = n),
    keltnerChannels = TTR::keltnerChannels(HLC = HLC, n = n),
    KST =   TTR::KST(CLOSE, n = n),
    MACD =  TTR::MACD(x = CLOSE,nFast = n, nSlow = n*2),
    MFI =   TTR::MFI(HLC = HLC, n = n, volume = VOLUME),
    OBV =   TTR::OBV(CLOSE,volume = VOLUME),
    PBands =TTR::PBands(CLOSE, n = n),
    ROC =   TTR::ROC(OHLC, n = n),
    RSI =   TTR::RSI(CLOSE, n),
    SAR =   TTR::SAR(HL = HL),
    SMA =   TTR::SMA(x = CLOSE, n = n),
    SMI =   TTR::SMI(HLC = HLC, n = n, nFast = n, nSlow = n*2),
    stoch = TTR::stoch(HLC = HLC, nFastK = n, nFastD = n, nSlowD = n*2),
    TDI =   TTR::TDI(CLOSE,n = n),
    TR =    TTR::TR(HLC = HLC),
    TRIX =  TTR::TRIX(price = CLOSE, n = n),
    VHF =   TTR::VHF(price = CLOSE, n = n),
    volatility = TTR::volatility(OHLC = OHLC, n = n),
    williamsAD = TTR::williamsAD(HLC = HLC)
  )
}

############################################################
######### Making a comparison on different windows #########
############################################################

last_row_from_window500 <- tail(p,500) |> get_all_indicators(n = 10) |> tail(1)
last_row_from_window200 <- tail(p,200) |> get_all_indicators(n = 10) |> tail(1)

TwolastRows <- rbind.data.frame(last_row_from_window500,
                                last_row_from_window200)
#print(TwolastRows)

# Which indicators do not correspond to each other
colnames(last_row_from_window500)[  apply(TwolastRows,2, \(x)  x[1]!=x[2]) ]

[1] "SNR"                "ADX.DIp"            "ADX.DIn"            "ADX.DX"            
 [5] "ADX.ADX"            "ATR.atr"            "BBands.dn"          "BBands.mavg"       
 [9] "BBands.up"          "BBands.pctB"        "cci"                "chaikinAD"         
[13] "CMF"                "EVWMA"              "keltnerChannels.dn" "keltnerChannels.up"
[17] "MACD.macd"          "MACD.signal"        "mfi"                "obv"               
[21] "PBands.dn"          "PBands.center"      "PBands.up"          "rsi"               
[25] "SMA"                "SMI.SMI"            "SMI.signal"         "stoch.slowD"       
[29] "TRIX.signal"        "williamsAD"

Session Info

sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

time zone: Europe/Kiev
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] quantmod_0.4.25 TTR_0.24.4      xts_0.13.1      zoo_1.8-12     

loaded via a namespace (and not attached):
[1] compiler_4.3.2 tools_4.3.2    curl_5.2.0     grid_4.3.2     lattice_0.21-9

joshuaulrich commented 8 months ago

Your example has almost 40 indicators. Can you please provide a minimal example with the indicators you think have an issue?

Your comment says that the SMA() result is different based on the number of observations in the window. That's extremely unlikely. Also note that recursive indicators like EMA() and indicators that use them (e.g. RSI()) are unlikely to have the same values for different window sizes. See the Warning section in ?TTR::MovingAverages.
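
The difference between a windowed average and a recursive one can be illustrated with a small base-R sketch. The helpers `sma_last()` and `ema_last()` are hypothetical stand-ins written for this illustration, not TTR's implementation; the EMA here is assumed to be seeded with the mean of the first `n` observations and to use the smoothing ratio 2 / (n + 1).

```r
set.seed(42)
x <- cumsum(rnorm(1000, sd = 0.01)) + 100

# Windowed average: the last value depends only on the last n observations
sma_last <- function(x, n = 10) mean(tail(x, n))

# Recursive EMA (hypothetical helper): seeded with the mean of the first
# n observations, then updated one observation at a time
ema_last <- function(x, n = 10) {
  alpha <- 2 / (n + 1)
  ema <- mean(x[1:n])
  for (i in (n + 1):length(x)) {
    ema <- alpha * x[i] + (1 - alpha) * ema
  }
  ema
}

# The windowed average does not care where the series starts
sma_500 <- sma_last(tail(x, 500))
sma_200 <- sma_last(tail(x, 200))
identical(sma_500, sma_200)

# The recursive average carries its seed forward, so short windows disagree
ema_60 <- ema_last(tail(x, 60))
ema_30 <- ema_last(tail(x, 30))
abs(ema_60 - ema_30)

# The influence of the seed decays geometrically, so long windows nearly agree
ema_500 <- ema_last(tail(x, 500))
ema_200 <- ema_last(tail(x, 200))
abs(ema_500 - ema_200)
```

In this sketch the seed's weight after k updates is (1 - alpha)^k, which is why the gap between the two longer windows should be far smaller than the gap between the two shorter ones.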

mytarmail commented 8 months ago

Hi! Here is a shorter example in which all three indicators give different values on different windows:

library(xts)
library(quantmod)
library(TTR)
n <- 100000
set.seed(1)
#######################################
#########  make some prices ###########
#######################################
volume <- sample(1:10,n,replace = TRUE,prob = c(10:1))
prices <- round(cumsum(rnorm(n,sd = 0.01)),2)+1000

p <-  cbind(price = prices, volume = volume) |> 
  xts(as.POSIXct("2000-01-01 00:00:00 EET")-n:1) |> 
  to.minutes(name = "my") 
colnames(p) <- gsub("^my\\.", "", colnames(p))

############################################################
#########  function for calculate all indicators ###########
############################################################

MINI_get_all_indicators <- function(p, n= 10){
  CLOSE <- quantmod::Cl(p)
  OHLC <-  quantmod::OHLC(p)
  HLC <-   quantmod::HLC(p)
  HL <-    quantmod::HL(p)
  VOLUME <- quantmod::Vo(p)

  data.frame(
    stoch = TTR::stoch(HLC = HLC, nFastK = n, nFastD = n, nSlowD = n*2),
    TRIX =  TTR::TRIX(price = CLOSE, n = n),
    williamsAD = TTR::williamsAD(HLC = HLC)
  )
}

############################################################
######### Making a comparison on different windows #########
############################################################

last_row_from_window500 <- tail(p,500) |> MINI_get_all_indicators(n = 10) |> tail(1)
last_row_from_window200 <- tail(p,200) |> MINI_get_all_indicators(n = 10) |> tail(1)

TwolastRows <- rbind.data.frame(last_row_from_window500,
                                last_row_from_window200)
#print(TwolastRows)

# Which indicators do not correspond to each other
colnames(last_row_from_window500)[  apply(TwolastRows,2, \(x)  x[1]!=x[2]) ]

[1] "stoch.slowD" "TRIX.signal" "williamsAD"

joshuaulrich commented 8 months ago

Thanks! I just noticed that you're checking whether two numbers are exactly equal (x[1] != x[2]). That's subject to floating point precision error (see R FAQ 7.31). Use this instead:

num_diff <- function(x) { abs(diff(x)) > sqrt(.Machine$double.eps) }
(last_row_from_window500)[apply(TwolastRows, 2, num_diff)]
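
As a minimal base-R sketch of why exact comparison is fragile: the data frame `rows` below is made up for illustration, and `num_diff()` is repeated so the snippet runs on its own.

```r
a <- 0.1 + 0.2
b <- 0.3

a == b                    # FALSE: the two doubles differ in the last bits
isTRUE(all.equal(a, b))   # TRUE: equal within a small tolerance

# Same tolerance-based check as above, applied column by column
num_diff <- function(x) { abs(diff(x)) > sqrt(.Machine$double.eps) }

# Made-up two-row data: column "a" differs only by rounding error,
# column "b" differs for real
rows <- data.frame(a = c(0.3, 0.1 + 0.2), b = c(1, 2))
flags <- apply(rows, 2, num_diff)
flags                     # a: FALSE, b: TRUE
```

The square root of `.Machine$double.eps` is roughly 1.5e-8, which is also the default tolerance of `all.equal()`.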

Once you do that, very few of the final indicator values are different. Here are the results using all 40-ish indicators:

print(diffs <- last_row_from_window500[apply(TwolastRows, 2, num_diff)])
##                      ADX.ADX chaikinAD   obv   SMI.SMI SMI.signal williamsAD
## 1999-12-31 23:59:59 35.06024 -4886.041 -2129 -55.81354  -49.64856      -1.43

TwolastRows[, colnames(diffs)]
##                       ADX.ADX chaikinAD   obv   SMI.SMI SMI.signal williamsAD
## 1999-12-31 23:59:59  35.06024 -4886.041 -2129 -55.81354  -49.64856      -1.43
## 1999-12-31 23:59:591 35.06024 -2925.638  -124 -55.81354  -49.64856      -0.55

So it looks like only chaikinAD(), OBV(), and williamsAD() are different. All 3 of those functions use the cumulative sum of the series, so it makes sense that they would be different depending on the start of the series. That's not something I can fix.
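
A rough base-R sketch of the cumulative-sum effect follows. The helper `obv_like()` is a hypothetical OBV-style accumulator written for this illustration, not TTR's `OBV()`, and the price and volume series are simulated.

```r
set.seed(1)
price  <- cumsum(rnorm(600, sd = 0.01)) + 100
volume <- sample(1:10, 600, replace = TRUE)

# Hypothetical OBV-style helper: volume signed by the direction of the
# price change, accumulated from the first bar of whatever series it is given
obv_like <- function(price, volume) {
  signed <- c(0, sign(diff(price))) * volume
  cumsum(signed)
}

obv_500 <- obv_like(tail(price, 500), tail(volume, 500))
obv_200 <- obv_like(tail(price, 200), tail(volume, 200))

# The level of the last value depends on where accumulation started
tail(obv_500, 1) - tail(obv_200, 1)

# Over the 200 overlapping bars the two series differ by one constant offset
offset <- tail(obv_500, 200) - obv_200
length(unique(offset)) == 1                           # TRUE

# so the bar-to-bar changes are identical
identical(diff(tail(obv_500, 200)), diff(obv_200))    # TRUE
```

If a model needs window-independent inputs, one option under these assumptions is to feed it the change in such an indicator, or to always accumulate from the same fixed starting bar.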

ADX and SMI look close; their differences are probably just slightly larger than the tolerance used above.

mytarmail commented 8 months ago

Thank you! You are amazing!!

joshuaulrich / TTR