facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License

[R] Conflict with the future/future.apply packages #1137

Closed RamiKrispin closed 4 years ago

RamiKrispin commented 4 years ago

Hello,

I recently started getting an error when running backtesting for a prophet model with future.apply::future_lapply(). It was working fine before, so I probably updated one of the packages that prophet depends on (my sessionInfo() output is at the bottom). The error occurs only when calling the prophet function explicitly with the :: operator (e.g., prophet::prophet(...)).

I created a reproducible example below using the AirPassengers dataset:

library(prophet) # this will load Rcpp and rlang packages to the env
library(TSstudio) # For converting ts object to data.frame

data("AirPassengers")

# Converting the object to prophet object
df <- ts_to_prophet(AirPassengers)

# Setting the forecast horizon
h <- 12 

# Setting the backtesting breaks
start <- nrow(df) -  3 * 6 
s <- seq(from = 144, by = -3, length.out = 6)

This won't work:

future::plan(future::multiprocess, workers = 6) 
prophet_backtesting_parallel <- future.apply::future_lapply(rev(s), function(i){

  df_sub <- df[1:i,]

  train <- df_sub[1:(i - h), ]
  test <- df_sub[(i - h + 1):i, ]

  md <- prophet::prophet(df = train,  
                yearly.seasonality = 6,
                weekly.seasonality = FALSE, 
                daily.seasonality = FALSE)
  future <- prophet::make_future_dataframe(md, periods = 12)
  fc <- predict(md, future)

  return(fc)
})

I am getting the following error:

failed to create the optimizer; optimization not done
Error in if (stan.fit$return_code != 0) { : argument is of length zero

On the other hand, when the prophet function is called directly, without the :: operator, it works:

future::plan(future::multiprocess, workers = 6) 
prophet_backtesting_parallel <- future.apply::future_lapply(rev(s), function(i){

  df_sub <- df[1:i,]

  train <- df_sub[1:(i - h), ]
  test <- df_sub[(i - h + 1):i, ]

  md <- prophet(df = train,  
                         yearly.seasonality = 6,
                         weekly.seasonality = FALSE, 
                         daily.seasonality = FALSE)
  future <- prophet::make_future_dataframe(md, periods = 12)
  fc <- predict(md, future)

  return(fc)

})

It also works without any issues when using the regular lapply function:

prophet_backtesting <- lapply(rev(s), function(i){

  df_sub <- df[1:i,]

  train <- df_sub[1:(i - h), ]
  test <- df_sub[(i - h + 1):i, ]

  md <- prophet::prophet(df = train, 
                yearly.seasonality = 6, 
                weekly.seasonality = FALSE, 
                daily.seasonality = FALSE)
  future <- prophet::make_future_dataframe(md, periods = 12)
  fc <- predict(md, future)
  return(fc)
})

Any suggestions for how to overcome this issue?

my sessionInfo() output:

R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] prophet_0.5         rlang_0.4.0         TSstudio_0.1.4.9000 Rcpp_1.0.1         

loaded via a namespace (and not attached):
 [1] pillar_1.4.2           compiler_3.6.0         xts_0.11-2             prettyunits_1.0.2      tools_3.6.0           
 [6] digest_0.6.20          packrat_0.5.0-20       pkgbuild_1.0.3         lubridate_1.7.4        lattice_0.20-38       
[11] tibble_2.1.3           gtable_0.3.0           pkgconfig_2.0.2        cli_1.1.0              rstudioapi_0.10       
[16] parallel_3.6.0         loo_2.1.0               gridExtra_2.3          extraDistr_1.8.11     
[21] stringr_1.4.0          dplyr_0.8.3            globals_0.12.4         stats4_3.6.0           grid_3.6.0            
[26] tidyselect_0.2.5       glue_1.3.1             inline_0.3.15          listenv_0.7.0          R6_2.4.0              
[31] processx_3.4.0         future.apply_1.3.0     rstan_2.19.2           tidyr_0.8.3            callr_3.3.0           
[36] purrr_0.3.2            ggplot2_3.2.1          magrittr_1.5           scales_1.0.0           ps_1.3.0              
[41] codetools_0.2-16       StanHeaders_2.19.0     matrixStats_0.54.0     assertthat_0.2.1       future_1.14.0         
[46] colorspace_1.4-1       stringi_1.4.3          lazyeval_0.2.2         munsell_0.5.0          crayon_1.3.4          
[51] zoo_1.8-6      

Thank you in advance, Rami

bletham commented 4 years ago

I'm not exactly sure what's happening here. There are situations in which prophet::prophet does not work due to Rcpp issues (#285), which could be related to the "failed to create the optimizer" error. However, I'm actually not able to reproduce this. For me, this code works:

library(prophet)
library(future)
library(future.apply)

df <- read.csv('example_wp_log_peyton_manning.csv')

# Setting the forecast horizon
h <- 12 

# Setting the backtesting breaks
start <- nrow(df) -  3 * 6 
s <- seq(from = 144, by = -3, length.out = 6)

future::plan(future::multiprocess, workers = 6) 
prophet_backtesting_parallel <- future.apply::future_lapply(rev(s), function(i){

  df_sub <- df[1:i,]

  train <- df_sub[1:(i - h), ]
  test <- df_sub[(i - h + 1):i, ]

  md <- prophet::prophet(df = train,  
                yearly.seasonality = 6,
                weekly.seasonality = FALSE, 
                daily.seasonality = FALSE)
  future <- prophet::make_future_dataframe(md, periods = 12)
  fc <- predict(md, future)

  return(fc)
})

So I wonder whether the ts_to_prophet function might be creating a dataframe that is somehow different. Could you try running the code I posted with this CSV (https://github.com/facebook/prophet/blob/master/examples/example_wp_log_peyton_manning.csv) and see if it works?

RamiKrispin commented 4 years ago

Hi @bletham, I am still getting the same error when running your example above. It does work when calling the function without the package reference (e.g., prophet(df = train... ) as opposed to prophet::prophet(df = train...)). Could it be related to the OS or the versions of the compilers?

bletham commented 4 years ago

I wonder if it could be different package versions. Could you post the output of this:

library(prophet)
library(future)
library(future.apply)

sessionInfo()

which for me is

R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] future.apply_1.3.0 future_1.14.0      prophet_0.5        rlang_0.4.0       
[5] Rcpp_1.0.2        

loaded via a namespace (and not attached):
 [1] pillar_1.4.2       compiler_3.4.4     tools_3.4.4        prettyunits_1.0.2 
 [5] digest_0.6.21      pkgbuild_1.0.5     tibble_2.1.3       gtable_0.3.0      
 [9] pkgconfig_2.0.3    cli_1.1.0          parallel_3.4.4     loo_2.1.0         
[13] gridExtra_2.3      dplyr_0.8.3        globals_0.12.4     stats4_3.4.4      
[17] grid_3.4.4         tidyselect_0.2.5   glue_1.3.1         inline_0.3.15     
[21] listenv_0.7.0      R6_2.4.0           processx_3.4.1     rstan_2.19.2      
[25] ggplot2_3.2.1      callr_3.3.2        purrr_0.3.2        magrittr_1.5      
[29] scales_1.0.0       ps_1.3.0           codetools_0.2-15   StanHeaders_2.19.0
[33] matrixStats_0.55.0 assertthat_0.2.1   colorspace_1.4-1   lazyeval_0.2.2    
[37] munsell_0.5.0      crayon_1.3.4      

RamiKrispin commented 4 years ago

Here is the output of the sessionInfo() on my machine:

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] future.apply_1.3.0 future_1.14.0      prophet_0.5        rlang_0.4.0       
[5] Rcpp_1.0.2        

loaded via a namespace (and not attached):
 [1] pillar_1.4.2       compiler_3.6.0     prettyunits_1.0.2  tools_3.6.0       
 [5] digest_0.6.20      packrat_0.5.0-20   pkgbuild_1.0.3     tibble_2.1.3      
 [9] gtable_0.3.0       pkgconfig_2.0.2    cli_1.1.0          rstudioapi_0.10   
[13] parallel_3.6.0     loo_2.1.0          gridExtra_2.3      dplyr_0.8.3       
[17] globals_0.12.4     stats4_3.6.0       grid_3.6.0         tidyselect_0.2.5  
[21] glue_1.3.1         inline_0.3.15      listenv_0.7.0      R6_2.4.0          
[25] processx_3.4.0     rstan_2.19.2       callr_3.3.0        purrr_0.3.2       
[29] ggplot2_3.2.1      magrittr_1.5       scales_1.0.0       ps_1.3.0          
[33] codetools_0.2-16   StanHeaders_2.19.0 matrixStats_0.54.0 assertthat_0.2.1  
[37] colorspace_1.4-1   lazyeval_0.2.2     munsell_0.5.0      crayon_1.3.4    

bletham commented 4 years ago

Oh, I think I see what is happening - in my code I had done library(prophet). If I run it without first importing the library, then I do get

Error in cpp_object_initializer(.self, .refClassDef, ...) : 
  could not find function "cpp_object_initializer"
failed to create the optimizer; optimization not done
Error in if (stan.fit$return_code != 0) { : argument is of length zero

I expect this is from #285, which is that you have to do library(Rcpp) (or equivalently, load prophet or rstan) before you can do prophet::prophet.
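For context on #285: in R, pkg::fun loads a package's namespace but does not attach the package to the search path, so a function that other code expects to find on the search path (such as cpp_object_initializer here) stays invisible. A minimal base-R sketch of that distinction follows. It uses the tools package purely as a stand-in (it is not involved in this issue) and assumes a fresh session in which tools has not been attached:

```r
# Assumes a fresh R session: `tools` ships with R but is not attached by default.

# `::` loads the namespace on demand, so the call itself works.
ext <- tools::file_ext("passengers.csv")

# The namespace is now loaded ...
loaded <- isNamespaceLoaded("tools")

# ... but the package is still not on the search path.
attached <- "package:tools" %in% search()

print(ext)
print(c(loaded = loaded, attached = attached))
```

In a fresh session this should report loaded as TRUE and attached as FALSE, which is the same state Rcpp is in when prophet::prophet() is called without a prior library() call.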

RamiKrispin commented 4 years ago

Hi @bletham, issue #285, as you mention, occurs as a result of prophet's dependency on Rcpp. That is not the case here, as I am getting the error even though I load the prophet package before running the code. The error occurs only when using future.apply::future_lapply() and prophet::prophet() together. Any other combination (e.g., lapply() with prophet::prophet(), or future_lapply() without ::) works well.
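One way to narrow this down is to compare which packages are attached in the main session with what each worker sees. The sketch below is only a diagnostic idea, not something confirmed in this thread. It uses future::multisession (separate background R sessions, which is what multiprocess resolves to on Windows), and the two workers are an arbitrary choice for illustration:

```r
library(future)
library(future.apply)

# Separate background R sessions instead of forked processes
future::plan(future::multisession, workers = 2)

# Ask each worker which packages are on its search path
worker_search <- future.apply::future_lapply(1:2, function(i) search())

# Entries attached in the main session but missing on the first worker
missing_on_worker <- setdiff(search(), worker_search[[1]])
print(missing_on_worker)

# Shut the workers down again
future::plan(future::sequential)
```

If package:Rcpp or package:prophet shows up in missing_on_worker after library(prophet) has been called in the main session, that would point to the workers rather than the main session as the place where cpp_object_initializer cannot be found.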

RamiKrispin commented 4 years ago

BTW, the error I am getting when running the examples above is not unique to my machine, as I am getting the same error when running them on Windows 7.

bletham commented 4 years ago

I see, in that case I'm unable to reproduce. This code works for me, tested on a couple different Linux systems:

library(prophet)
library(TSstudio)

data("AirPassengers")
df <- ts_to_prophet(AirPassengers)

h <- 12 
start <- nrow(df) -  3 * 6 
s <- seq(from = 144, by = -3, length.out = 6)

future::plan(future::multiprocess, workers = 6) 
prophet_backtesting_parallel <- future.apply::future_lapply(rev(s), function(i){

  df_sub <- df[1:i,]

  train <- df_sub[1:(i - h), ]
  test <- df_sub[(i - h + 1):i, ]

  md <- prophet::prophet(df = train,  
                yearly.seasonality = 6,
                weekly.seasonality = FALSE, 
                daily.seasonality = FALSE)
  future <- prophet::make_future_dataframe(md, periods = 12)
  fc <- predict(md, future)

  return(fc)
})

If I remove library(prophet) then I get the error reported, but do not if it is there.

We have the same version of all of the packages that I'd expect to be relevant. Does it work for you if you entirely remove prophet::? Specifically this code:

library(prophet)
library(TSstudio)

data("AirPassengers")
df <- ts_to_prophet(AirPassengers)

h <- 12 
start <- nrow(df) -  3 * 6 
s <- seq(from = 144, by = -3, length.out = 6)

future::plan(future::multiprocess, workers = 6) 
prophet_backtesting_parallel <- future.apply::future_lapply(rev(s), function(i){

  df_sub <- df[1:i,]

  train <- df_sub[1:(i - h), ]
  test <- df_sub[(i - h + 1):i, ]

  md <- prophet(df = train,  
                yearly.seasonality = 6,
                weekly.seasonality = FALSE, 
                daily.seasonality = FALSE)
  future <- make_future_dataframe(md, periods = 12)
  fc <- predict(md, future)

  return(fc)
})

RamiKrispin commented 4 years ago

Yes, as I mentioned in the original example above, it does run when removing the prophet:: prefix from the prophet function call.

bletham commented 4 years ago

Seems like there might be something related to #285 going on, but otherwise I'm not able to reproduce locally and I'm not exactly sure what might be happening. Maybe it has to do with the process fork happening for multiprocessing.
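A possible workaround, untested in this thread: ask future.apply to attach the packages on each worker through the future.packages argument of future_lapply(), so that Rcpp is on the worker's search path even when prophet is called as prophet::prophet(). This assumes the failure comes from the workers not having Rcpp attached, which nobody above confirmed. The multisession plan, the two workers, and freq = "month" are choices made for this sketch only:

```r
library(prophet)
library(TSstudio)

data("AirPassengers")
df <- ts_to_prophet(AirPassengers)

h <- 12                                          # forecast horizon
s <- seq(from = 144, by = -3, length.out = 6)    # backtesting breaks

future::plan(future::multisession, workers = 2)

fc_list <- future.apply::future_lapply(rev(s), function(i) {
  # Train on everything before the last h observations of the window
  train <- df[1:(i - h), ]

  md <- prophet::prophet(df = train,
                         yearly.seasonality = 6,
                         weekly.seasonality = FALSE,
                         daily.seasonality = FALSE)

  # AirPassengers is monthly, so the future frame is built in monthly steps
  future_df <- prophet::make_future_dataframe(md, periods = h, freq = "month")
  predict(md, future_df)
}, future.packages = c("Rcpp", "prophet"))  # attach these on every worker

future::plan(future::sequential)
```

If this runs while the version without future.packages fails, that would support the idea that the :: call alone does not cause the packages to be attached on the workers.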