Error running robyn_outputs()

tgtod002 commented 2 years ago

Project Robyn 3.6

I am getting this error message when running robyn_outputs():

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
  object 'dt_expoCurvePlot' not found

gufengzhou commented 2 years ago

Hi, have you installed/updated to the latest Robyn v3.6.0?

tgtod002 commented 2 years ago

Yes. I have.

Jozephz commented 2 years ago

I'm also getting a very similar error message when running the model with robyn_run():

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
  object 'dt_expoCurvePlot' not found
In addition: Warning message:
In check_legacy_input(InputCollect, cores, iterations, trials, intercept_sign, :
  Using legacy InputCollect values. Please set "iterations", "trials" within robyn_run() instead

This message began appearing yesterday, after a suggested update. I think it's possible a dependency may have been broken here, but I'm not sure how to resolve this. Thanks for your help :)

F1nalFortune commented 2 years ago

I am also receiving the same error here. Before this error appears, I get a convergence plot error:

> OutputModels$convergence$moo_distrb_plot
Error in if (!(lo <- min(hi, IQR(x)/1.34))) (lo <- hi) || (lo <- abs(x[1L])) ||  : 
  missing value where TRUE/FALSE needed

laresbernardo commented 2 years ago

@F1nalFortune this seems like another issue. Reading this ticket in ggridges repo, it seems like an Inf issue. Can you please check your data for missing values, too many zeroes or NAs?
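
One way to run the check suggested above is a small per-column summary. This is a minimal base R sketch, not part of Robyn: the helper name `summarise_data_quality()` and the toy data frame are hypothetical and only stand in for a real input table.

```r
# Hypothetical helper (not part of Robyn): per-column counts of NA and Inf
# values and the share of exact zeros among numeric columns.
summarise_data_quality <- function(df) {
  num_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  data.frame(
    column     = num_cols,
    n_na       = vapply(num_cols, function(cn) sum(is.na(df[[cn]])), integer(1)),
    n_inf      = vapply(num_cols, function(cn) sum(is.infinite(df[[cn]])), integer(1)),
    zero_share = vapply(num_cols, function(cn) mean(df[[cn]] == 0, na.rm = TRUE), numeric(1)),
    row.names  = NULL
  )
}

# Toy data standing in for a real input table
toy <- data.frame(
  week  = 1:6,
  spend = c(0, 0, 0, 10, NA, 25),
  impr  = c(0, 100, Inf, 300, 250, 0)
)
quality <- summarise_data_quality(toy)
print(quality)
```

Columns with a non-zero `n_inf` or `n_na`, or with a very high `zero_share`, are the ones worth inspecting first.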

tgtod002 commented 2 years ago

This is the new error I am getting. Also, no plots or files were output.

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Error in RNGseq(n, seed, ..., version = if (checkRNGversion("1.4") >= :
  NMF::createStream - invalid value for 'n' [positive value expected]
In addition: Warning message:
In min(coef) : no non-missing arguments to min; returning Inf

tgtod002 commented 2 years ago

I re-ran robyn_run() with outputs = FALSE, then re-ran robyn_outputs(). I am back to the original error message:

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
  object 'dt_expoCurvePlot' not found

gufengzhou commented 2 years ago

The "dt_expoCurvePlot not found" bug should be fixed now (commit 9b77b998aae7897356302e5d3ae259d482c2f038). Please update and test the package.

tgtod002 commented 2 years ago

@gufengzhou I re-installed Robyn and re-ran the code. I am still getting the same error message:

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
  object 'dt_expoCurvePlot' not found

laresbernardo commented 2 years ago

@tgtod002 If you could share your script and anonymized data with me (laresbernardo @gmail.com) so I can replicate it, that would def help me debug this issue. Before that, make sure you are really running the latest version in a fresh new session, and check whether the error persists.

tgtod002 commented 2 years ago

Yes, I am running this version:

packageVersion("Robyn")
[1] ‘3.6.0’

I will send it to you shortly.

FDwangchao commented 2 years ago

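@laresbernardo @tgtod002 In the pareto.R script, at line 269 (the end of the code block), I added the following code and re-installed Robyn. After that it runs through and the error no longer occurs. My guess at the cause: paid_media_vars and exposure_vars in InputCollect are the same, so the code inside the block is never executed. The object is therefore never defined outside the block, which triggers the error.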

laresbernardo commented 2 years ago

Hi, @FDwangchao thanks for your suggestion. Can you please update to the most recent version and check that the fix Gufeng deployed also works for you? He added dt_expoCurvePlot <- NULL instead, which should be enough to fix this problem. Thanks
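
The failure pattern discussed here (an object assigned only inside a conditional branch, then used after it) can be reproduced in a few lines of base R. The sketch below is a simplified stand-in and not the actual robyn_pareto() code: the function `make_plot_data()`, its arguments, and the `init_null` switch are hypothetical, and the condition is only an assumption based on the guess above.

```r
# Simplified stand-in for the failure pattern; this is NOT the actual
# robyn_pareto() code. make_plot_data() and its arguments are hypothetical.
make_plot_data <- function(media_vars, exposure_vars, init_null = FALSE) {
  if (init_null) dt_expoCurvePlot <- NULL  # define the object up front
  if (!identical(media_vars, exposure_vars)) {
    # Only runs when the two sets of names differ
    dt_expoCurvePlot <- data.frame(channel = exposure_vars)
  }
  # Without the initialisation, this line fails when the branch is skipped
  list(dt_expoCurvePlot = dt_expoCurvePlot)
}

same <- c("tv", "search")

# Branch skipped and nothing initialised: the lookup of the object fails
res_bug <- tryCatch(
  make_plot_data(same, same),
  error = function(e) conditionMessage(e)
)
print(res_bug)

# With the NULL initialisation the function returns normally
res_fix <- make_plot_data(same, same, init_null = TRUE)
print(is.null(res_fix$dt_expoCurvePlot))
```

Initialising to NULL keeps later code simple, since downstream steps can test `is.null()` rather than checking whether the name exists.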

tgtod002 commented 2 years ago

Hi Lares, the error occurs when running robyn_outputs(). I am actually just following the demo.

Here is my code

OutputModels <- robyn_run(
  InputCollect = InputCollect # feed in all model specification
  #, cores = NULL # default
  #, add_penalty_factor = FALSE # Untested feature. Use with caution.
  , iterations = 2000 # recommended for the dummy dataset
  , trials = 5 # recommended for the dummy dataset
  , outputs = FALSE # outputs = FALSE disables direct model output
)

OutputCollect <- robyn_outputs(
  InputCollect, OutputModels
  , pareto_fronts = 3
  , csv_out = "all" # "pareto" or "all"
  , clusters = TRUE # Set to TRUE to cluster similar models by ROAS. See ?robyn_clusters
  , plot_pareto = FALSE # Set to FALSE to deactivate plotting and saving model one-pagers
  , plot_folder = robyn_object # path for plots export
)

laresbernardo commented 2 years ago

Sorry @tgtod002 but I can't seem to replicate the issue with the demo.R file and your code:

> OutputModels <- robyn_run(
+   InputCollect = InputCollect # feed in all model specification
+   #, cores = NULL # default
+   #, add_penalty_factor = FALSE # Untested feature. Use with caution.
+   , iterations = 2000 # recommended for the dummy dataset
+   , trials = 5 # recommended for the dummy dataset
+   , outputs = FALSE # outputs = FALSE disables direct model output
+ )
Input data has 208 weeks in total: 2015-11-23 to 2019-11-11
Initial model is built on rolling window of 92 week: 2016-11-21 to 2018-08-20
Using geometric adstocking with 19 hyperparameters (19 to iterate + 0 fixed) on 16 cores
>>> Starting 5 trials with 2000 iterations each using TwoPointsDE nevergrad algorithm...
  Running trial 1 of 5
  |==========================================================================================================| 100%
  Finished in 1.65 mins
  Running trial 2 of 5
  |==========================================================================================================| 100%
  Finished in 1.68 mins
  Running trial 3 of 5
  |==========================================================================================================| 100%
  Finished in 1.67 mins
  Running trial 4 of 5
  |==========================================================================================================| 100%
  Finished in 1.62 mins
  Running trial 5 of 5
  |==========================================================================================================| 100%
  Finished in 1.45 mins
- DECOMP.RSSD NOT converged: sd@qt.20 0.068 >= 0.059 & med@qt.20 0.15 < 0.22 med@qt.1-3*sd
- NRMSE converged: sd@qt.20 0.0047 < 0.046 & med@qt.20 0.059 < 0.089 med@qt.1-3*sd
Total run time: 8.16 mins

> OutputCollect <- robyn_outputs(
+   InputCollect, OutputModels
+   , pareto_fronts = 3
+   , csv_out = "all" # "pareto" or "all"
+   , clusters = TRUE # Set to TRUE to cluster similar models by ROAS. See ?robyn_clusters
+   , plot_pareto = FALSE # Set to FALSE to deactivate plotting and saving model one-pagers
+   , plot_folder = robyn_object # path for plots export
+ )
Using robyn object location: /Users/bernardolares/Desktop
>>> Running Pareto calculations for 10000 models on 3 fronts...
>>> Collecting 154 pareto-optimum results into: /Users/bernardolares/Desktop/2022-02-28 14.04 init/
>> Exporting all results as CSVs into directory...
>> Exporting general plots into directory...
>>> Calculating clusters for model selection using Pareto fronts...
>> Auto selected k = 6 (clusters) based on minimum WSS variance of 5%

What hyperparameters are you using? Are you literally running the demo or did you adapt those hyperparameters to be used with your own data?

Please, update to the latest version, and DO make sure you're using the latest version in a fresh new session.

tgtod002 commented 2 years ago

Hi Lares,

Just making sure....are you using my sample file? I only have 105 weeks of data.

laresbernardo commented 2 years ago

Yes. But you have to adapt the hyperparameters to work with your data. I'd need those to replicate the issue.

tgtod002 commented 2 years ago

InputCollect <- robyn_inputs(
  dt_input = afw_data1
  ,dt_holidays = dt_prophet_holidays

  ### set variables

  ,date_var = "new_date"                               # date format must be "2020-01-01"
  ,dep_var = "SALES.AMT"                              # there should be only one dependent variable
  ,dep_var_type = "revenue"                         # "revenue" or "conversion"

  ,prophet_vars = c("trend", "season", "holiday") #               
  ,prophet_signs = c("default", "default", "default")       # c("default", "positive", and "negative").
  ,prophet_country = "US"                           # only one country allowed once. Including national s

  ,paid_media_vars = c("CrossDeviceDisplayimpressions",  "Cross_device_impressions", "DirectMailMediaimpressions", "DynamicMobileimpressions"   )
  ,paid_media_signs = c("positive", "positive", "positive", "positive")            #   
  ,paid_media_spends = c("CrossDevideDisplay_media_cost",  "Cross_dev_video_cost", "DirectMailInsert_media_cost", "DynamicMobilemedia_cost"     )         
  ### set model parameters

  ## set cores for parallel computing
  ,cores = 6 

  ## set rolling window start
  ,window_start = "2020-01-04"
  ,window_end = "2021-12-27"

  ## set model core features
  ,adstock = "geometric"            
  ,iterations = 2000  
  ,intercept_sign = "non_negative" # intercept_sign input must be any of: non_negative, unconstrained
  ,nevergrad_algo = "TwoPointsDE" 
  ,trials = 5 # number of allowed trials. 5 is recommended without calibration,

)

## 3. Hyperparameter interpretation & recommendation:

hyper_names(adstock = InputCollect$adstock, all_media = InputCollect$all_media)

hyperparameters <- list(

  Cross_dev_video_cost_alphas = c(0.5, 3)      
  ,Cross_dev_video_cost_gammas = c(0.3, 1)      
  ,Cross_dev_video_cost_thetas =  c(0, 0.3)

  ,CrossDevideDisplay_media_cost_alphas = c(0.5, 3) 
  ,CrossDevideDisplay_media_cost_gammas = c(0.3, 1)  
  ,CrossDevideDisplay_media_cost_thetas =  c(0, 0.3)

  ,DirectMailInsert_media_cost_alphas = c(0.5, 3)
  ,DirectMailInsert_media_cost_gammas = c(0.3, 1)
  ,DirectMailInsert_media_cost_thetas = c(0.1, 0.4)

  ,DynamicMobilemedia_cost_alphas = c(0.5, 3) 
  ,DynamicMobilemedia_cost_gammas =  c(0.3, 1)  
  , DynamicMobilemedia_cost_thetas   = c(0, 0.3)  
 )

#### 2a-3: Third, add hyperparameters into robyn_inputs()

InputCollect <- robyn_inputs(InputCollect = InputCollect, hyperparameters = hyperparameters)

laresbernardo commented 2 years ago

Ok, thanks for sharing. A couple of comments:

Your data contains many consecutive zeros: when checking it, I found loads of zeroes in the first sets of weeks. Please change to a narrower window and/or clean your data. That is def a problem. Notice the warnings printed:

Recommendations are: 
1. increase hyperparameter ranges for 0-coef channels to give Robyn more freedom
2. split media into sub-channels, and/or aggregate similar channels, and/or introduce other media
3. increase trials to get more samples
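
To help choose a narrower window, one option is to find the first week by which every media column has become active. This is a minimal base R sketch under that assumption; the helper `first_active_date()` and the toy columns are hypothetical, and the result is only a candidate for `window_start`, not a rule Robyn applies.

```r
# Hypothetical helper (not part of Robyn): first date on which every listed
# media column has seen at least one non-zero value.
first_active_date <- function(df, date_col, media_cols) {
  first_idx <- vapply(media_cols, function(cn) {
    nz <- which(!is.na(df[[cn]]) & df[[cn]] != 0)
    if (length(nz) == 0) NA_integer_ else nz[1]
  }, integer(1))
  if (anyNA(first_idx)) stop("At least one media column is zero or NA throughout")
  df[[date_col]][max(first_idx)]
}

# Toy weekly data standing in for the real input table
toy_media <- data.frame(
  new_date = as.Date("2020-01-04") + 7 * 0:5,
  cost_a   = c(0, 0, 0, 12, 15, 9),
  cost_b   = c(0, 5, 0, 7, 0, 3)
)
candidate_start <- first_active_date(toy_media, "new_date", c("cost_a", "cost_b"))
print(candidate_start)
```

Isolated zeros later in a series are left alone here; the helper only addresses long runs of leading zeros.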

You are not using the Measure_1__Visits_ data anywhere. Is that intentional?
Note that 'cores', 'iterations', 'trials', 'intercept_sign', and 'nevergrad_algo' should be set in robyn_run(), not robyn_inputs().
Regarding the "Michaelis-Menten fitting for X out of range. Using lm instead" message, check this answer.
With your particular input dataset, I noticed a small bug when calculating convergence with Inf values, which I've just fixed.
Lastly, I'm sorry, but with your code and your inputs I wasn't able to reproduce the mentioned (or any) error. Please DO make sure you're running the latest version. I'm attaching an example of how I'd like you to share your script so I can replicate the error (I simply pasted it all together). Try running it with the updated Robyn version and see if you get any errors: tgtod002-318.R.zip.
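
As a sketch of the point about where run-level settings belong, the values from the script posted above can be passed to robyn_run() instead of robyn_inputs(). The wrapper name `run_models()` is hypothetical, and the argument names are the ones mentioned in this thread for Robyn 3.6.x. They may differ in other versions.

```r
# Sketch only: run-level settings go to robyn_run(), not robyn_inputs().
# The wrapper name run_models() is hypothetical; the values are the ones
# from the script posted earlier in this thread.
run_models <- function(InputCollect) {
  Robyn::robyn_run(
    InputCollect   = InputCollect,
    cores          = 6,
    iterations     = 2000,
    trials         = 5,
    intercept_sign = "non_negative",
    nevergrad_algo = "TwoPointsDE",
    outputs        = FALSE  # export later with robyn_outputs()
  )
}

# Usage, once InputCollect has been built with robyn_inputs()
# (requires the Robyn package to be installed):
# OutputModels <- run_models(InputCollect)
```

With this split, robyn_inputs() carries only the data and model specification, which should also avoid the "Using legacy InputCollect values" warning quoted earlier.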

tgtod002 commented 2 years ago

Thanks Lares. I am running the program with some data as a test, so my data is still a work in progress. Given that, I noticed that I get an error in robyn_outputs() when I set outputs = TRUE in robyn_run(). The error is:

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in eval(jsub, SDenv, parent.frame()) : object 'iterNG' not found

If I set outputs = FALSE in robyn_run, robyn_outputs works.

I want to be able to look at the one-pagers, which are created when you set outputs = TRUE.

laresbernardo commented 2 years ago

Yeah, makes sense.

I notice you are trying to save results in C:/Users todor/... and you missed a "/" there.
Ideally, you'd first run robyn_run(..., outputs = FALSE) and then robyn_outputs(...) to export results. That way you don't have to re-run simulations to select the outputs you'd like.
Can you share the output of head(OutputModels$trial1$resultCollect$resultHypParam) (where OutputModels is the result of robyn_run(..., outputs = FALSE))?
Having outputs = TRUE is exactly the same as running robyn_outputs(), so you'll get the one-pagers either way.
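
On the path point above, building the folder path from its components and checking that it exists before exporting avoids separator mistakes. This is a minimal base R sketch; the helper `ensure_plot_folder()` is hypothetical and the demo writes under `tempdir()` rather than a real user directory.

```r
# Hypothetical helper (not part of Robyn): build a folder path from its
# components and create the folder if it does not exist yet.
ensure_plot_folder <- function(...) {
  path <- file.path(...)  # joins components with "/"
  if (!dir.exists(path)) dir.create(path, recursive = TRUE)
  normalizePath(path, winslash = "/")
}

# Demo under tempdir(); a real call would list the actual path components
plot_folder <- ensure_plot_folder(tempdir(), "AFW_MMM_Model")
print(dir.exists(plot_folder))
```

The resulting `plot_folder` can then be passed as the `plot_folder` argument of robyn_outputs(), as in the call posted earlier in the thread.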

laresbernardo commented 2 years ago

Feel free to re-open if this particular error or issue persists when updating to 3.6.2 @tgtod002
