hafen / stlplus

Seasonal-Trend Decomposition using Loess (STL) in R

Infinite values can sometimes crash the R session #10

Closed lutzvdb closed 10 months ago

lutzvdb commented 1 year ago

I have noticed that in certain situations, having infinite values in the input data can crash the R session. Interestingly, this does not occur every single time, and not on every machine. On multiple production machines, running R version 4.2.0 (2022-04-22 ucrt) and Windows Server 2019, it crashes about every third time. On a different machine, this time running R version 4.0.2 (2020-06-22) and Windows 12, it does not seem to crash. I have attached a minimal example that repeatedly crashes several of my machines.

I suggest including a stopping criterion in stlplus if infinite values are found, especially seeing how the entire decomposition doesn't make much sense to begin with when there are infinite values present.
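
The stopping criterion suggested above could look roughly like the following base R sketch. The helper name `stop_if_infinite` and the error message are hypothetical and not part of stlplus; the check simply counts infinite entries with `is.infinite()` and fails fast before any decomposition is attempted.

```r
# Hypothetical helper, not part of stlplus: fail fast on infinite input.
stop_if_infinite <- function(x) {
  n_inf <- sum(is.infinite(x))  # NA and NaN are not counted as infinite
  if (n_inf > 0) {
    stop(sprintf("x contains %d infinite value(s)", n_inf), call. = FALSE)
  }
  invisible(x)
}

# Example: the error is caught here only so the message can be inspected.
x <- c(1.2, Inf, 3.4, NA, -Inf)
result <- tryCatch(stop_if_infinite(x), error = function(e) conditionMessage(e))
print(result)
```

Under this assumption, a caller would run `stop_if_infinite(dt$x)` before passing the series to `stlplus()`.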

Minimal example:

library(data.table)
library(stlplus)
dt <- fread("bug.csv")

# You may have to run stlplus() a few times to encounter the bug
for(i in 1:100) {
  decomp <- stlplus(dt$x, n.p = 344, s.window = "periodic")
}

bug.csv

hafen commented 10 months ago

Thanks for this report and for providing an example. Since it doesn't make sense to smooth infinite values, I added a preprocessing step to replace infinite values with NA. I was not able to reproduce this on my machine, so if you can try the updated package on the machine where you were experiencing the error and report back, that would be great.
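
For illustration only, a preprocessing step of this kind can be sketched in base R as below. This is an assumption about the general approach, not the code on the branch; the vector `x` is made-up example data.

```r
# Made-up input containing infinite, NaN, and missing values.
x <- c(1.2, Inf, 3.4, NaN, -Inf, NA)

# Replace only the infinite entries with NA; NaN and existing NA are untouched.
x_clean <- x
x_clean[is.infinite(x_clean)] <- NA

print(x_clean)
```

Note that `is.infinite()` is `FALSE` for both `NA` and `NaN`, so only `Inf` and `-Inf` are replaced.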

You can install the updated package with:

remotes::install_github("hafen/stlplus@10-infinite-crash")

lutzvdb commented 10 months ago

Unfortunately, I can't reproduce the originally reported error on my personal machine either, and I no longer have access to the production machines used in the bug report. Therefore, I can't validate your proposed solution. However, based on my statements in the original bug report, it would seem that simply replacing infinite values with NAs would be an adequate solution.

Perhaps a warning could also be introduced? I'd consider it highly unlikely that the person calling stlplus with an infinite value in the time series is aware of the infinite value and wanting it to be there.

Cheers

hafen commented 10 months ago

Good point, thanks. I have updated accordingly.
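
As a rough sketch of the behavior discussed in this thread (replace infinite values with NA and warn), one might write something like the following. The helper name `inf_to_na` and the warning text are hypothetical; this is not taken from the stlplus source.

```r
# Hypothetical helper; name and message are illustrative, not the stlplus source.
inf_to_na <- function(x) {
  idx <- is.infinite(x)
  if (any(idx)) {
    warning(sprintf("Replacing %d infinite value(s) with NA", sum(idx)),
            call. = FALSE)
    x[idx] <- NA
  }
  x
}

# Possible use with the example from the report (requires stlplus and bug.csv):
# decomp <- stlplus(inf_to_na(dt$x), n.p = 344, s.window = "periodic")
```

Emitting a warning rather than an error lets the decomposition proceed while still telling the caller that the input contained infinite values.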