facebookexperimental / Robyn

Robyn is an experimental, AI/ML-powered and open sourced Marketing Mix Modeling (MMM) package from Meta Marketing Science. Our mission is to democratise modeling knowledge, inspire the industry through innovation, reduce human bias in the modeling process & build a strong open source marketing science community.
https://facebookexperimental.github.io/Robyn/
MIT License

Error running robyn_outputs() #318

Closed tgtod002 closed 2 years ago

tgtod002 commented 2 years ago

Project Robyn 3.6

Getting this error message when running robyn_outputs():

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
  object 'dt_expoCurvePlot' not found


gufengzhou commented 2 years ago

Hi, have you installed/updated to the latest Robyn v3.6.0?

tgtod002 commented 2 years ago

Yes. I have.

Jozephz commented 2 years ago

I'm also getting a very similar error message when running the model with robyn_run():

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
  object 'dt_expoCurvePlot' not found
In addition: Warning message:
In check_legacy_input(InputCollect, cores, iterations, trials, intercept_sign, :
  Using legacy InputCollect values. Please set "iterations", "trials" within robyn_run() instead

This message began appearing yesterday, after a suggested update. I think it's possible a dependency may have been broken here, but I'm not sure how to resolve this. Thanks for your help :)

F1nalFortune commented 2 years ago

I am also receiving the same error here. Before this error appears, I get a convergence plot error:

> OutputModels$convergence$moo_distrb_plot
Error in if (!(lo <- min(hi, IQR(x)/1.34))) (lo <- hi) || (lo <- abs(x[1L])) ||  : 
  missing value where TRUE/FALSE needed

laresbernardo commented 2 years ago

@F1nalFortune this seems like a different issue. Judging by this ticket in the ggridges repo, it looks like an Inf issue. Can you please check your data for missing values, too many zeroes, or NAs?
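
As an illustration of that kind of data check, here is a minimal base-R sketch. `check_input_quality()` is a hypothetical helper, not a Robyn function, and the toy data frame only stands in for the real input table; it counts NA and infinite values and reports the share of zeroes per numeric column.

```r
# Hypothetical helper (not part of Robyn): summarise NA, infinite and zero
# values for every numeric column of a data frame.
check_input_quality <- function(df) {
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  do.call(rbind, lapply(numeric_cols, function(col) {
    x <- df[[col]]
    data.frame(
      column = col,
      n_na = sum(is.na(x)),                    # missing values (NA or NaN)
      n_infinite = sum(is.infinite(x)),        # Inf or -Inf
      share_zero = mean(x == 0, na.rm = TRUE), # share of zeroes among non-missing values
      stringsAsFactors = FALSE
    )
  }))
}

# Toy data standing in for the real input table (made-up column names).
toy <- data.frame(
  spend = c(0, 0, 10, NA, 5),
  impressions = c(100, Inf, 0, 250, 300)
)
quality <- check_input_quality(toy)
print(quality)
```

Columns with a non-zero `n_na` or `n_infinite`, or with a very high `share_zero`, would be the first candidates to inspect.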

tgtod002 commented 2 years ago

This is the new error I am getting. Also, no plots/files were output.

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Error in RNGseq(n, seed, ..., version = if (checkRNGversion("1.4") >= :
  NMF::createStream - invalid value for 'n' [positive value expected]
In addition: Warning message:
In min(coef) : no non-missing arguments to min; returning Inf

tgtod002 commented 2 years ago

I re-ran robyn_run() with outputs = FALSE, then re-ran robyn_outputs(). I am back to the original error message:

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
  object 'dt_expoCurvePlot' not found

gufengzhou commented 2 years ago

The "object 'dt_expoCurvePlot' not found" bug should be fixed now (commit 9b77b998aae7897356302e5d3ae259d482c2f038). Please update and test the package.

tgtod002 commented 2 years ago

@gufengzhou I re-installed Robyn and re-ran the code. I am still getting the same error message:

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in robyn_pareto(InputCollect, OutputModels, pareto_fronts, calibration_constraint) :
  object 'dt_expoCurvePlot' not found

laresbernardo commented 2 years ago

@tgtod002 If you could share your script and anonymized data with me (laresbernardo @gmail.com) so I can replicate it, that would definitely help me debug this issue. Before that, please make sure you are really running the latest version in a fresh new session, and check whether the error persists.
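
For anyone wanting to script that version check, a minimal sketch follows. `is_at_least()` is a hypothetical helper built on base R's `package_version()`; the version strings are example inputs only.

```r
# Hypothetical helper: TRUE when the `installed` version string is at least
# as new as the `required` one.
is_at_least <- function(installed, required) {
  package_version(installed) >= package_version(required)
}

# In a fresh R session with Robyn installed, one might run:
# is_at_least(as.character(packageVersion("Robyn")), "3.6.0")

# Example inputs only, so the sketch runs without Robyn installed.
newer_ok <- is_at_least("3.6.0", "3.6.0")
older_ok <- is_at_least("3.5.0", "3.6.0")
print(c(newer_ok, older_ok))
```

Comparing through `package_version()` rather than as plain strings avoids surprises such as "3.10.0" sorting before "3.6.0".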

tgtod002 commented 2 years ago

Yes, I am running this version:

packageVersion("Robyn")
[1] ‘3.6.0’

I will send it to you shortly.

FDwangchao commented 2 years ago

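@laresbernardo @tgtod002 In the pareto.R script, at line 269 (the end of the code block), adding the following code and then reinstalling Robyn makes the run go through; this error no longer occurs. (image) My guess at the cause: paid_media_vars and exposure_vars in InputCollect are identical, so the code inside the block is never executed. As a result the object is never defined outside the block, which triggers the error.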

laresbernardo commented 2 years ago

Hi, @FDwangchao thanks for your suggestion. Can you please update to the most recent version and check that the fix Gufeng deployed also works for you? He added dt_expoCurvePlot <- NULL instead, which should be enough to fix this problem. Thanks
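
To illustrate the pattern being discussed, here is a minimal, self-contained sketch. It is not Robyn's actual source: the function names, the `exposure_differs` flag, and `plot_data` are made up. It only shows why an object assigned inside a skipped branch is "not found" later, and how a NULL default avoids that.

```r
# Illustrative only: made-up names, not Robyn code.
build_plot_data_buggy <- function(exposure_differs) {
  if (exposure_differs) {
    plot_data <- data.frame(x = 1:3)
  }
  plot_data # errors when the branch above was skipped: the name was never assigned
}

build_plot_data_fixed <- function(exposure_differs) {
  plot_data <- NULL # default, so the name always exists
  if (exposure_differs) {
    plot_data <- data.frame(x = 1:3)
  }
  plot_data
}

# Capture the error message instead of stopping the script.
buggy_result <- tryCatch(
  build_plot_data_buggy(FALSE),
  error = function(e) conditionMessage(e)
)
fixed_result <- build_plot_data_fixed(FALSE)
print(buggy_result)
print(is.null(fixed_result))
```

Downstream code then has to tolerate a NULL value, for example by checking `is.null()` before plotting.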

tgtod002 commented 2 years ago

Hi Lares, the error occurs when running robyn_outputs(). I am actually just following the demo.

Here is my code

OutputModels <- robyn_run(
  InputCollect = InputCollect # feed in all model specification
  #, cores = NULL # default
  #, add_penalty_factor = FALSE # Untested feature. Use with caution.
  , iterations = 2000 # recommended for the dummy dataset
  , trials = 5 # recommended for the dummy dataset
  , outputs = FALSE # outputs = FALSE disables direct model output
)

OutputCollect <- robyn_outputs(
  InputCollect, OutputModels
  , pareto_fronts = 3
  , csv_out = "all" # "pareto" or "all"
  , clusters = TRUE # Set to TRUE to cluster similar models by ROAS. See ?robyn_clusters
  , plot_pareto = FALSE # Set to FALSE to deactivate plotting and saving model one-pagers
  , plot_folder = robyn_object # path for plots export
)

laresbernardo commented 2 years ago

Sorry @tgtod002 but I can't seem to replicate the issue with the demo.R file and your code:

> OutputModels <- robyn_run(
+   InputCollect = InputCollect # feed in all model specification
+   #, cores = NULL # default
+   #, add_penalty_factor = FALSE # Untested feature. Use with caution.
+   , iterations = 2000 # recommended for the dummy dataset
+   , trials = 5 # recommended for the dummy dataset
+   , outputs = FALSE # outputs = FALSE disables direct model output
+ )
Input data has 208 weeks in total: 2015-11-23 to 2019-11-11
Initial model is built on rolling window of 92 week: 2016-11-21 to 2018-08-20
Using geometric adstocking with 19 hyperparameters (19 to iterate + 0 fixed) on 16 cores
>>> Starting 5 trials with 2000 iterations each using TwoPointsDE nevergrad algorithm...
  Running trial 1 of 5
  |==========================================================================================================| 100%
  Finished in 1.65 mins
  Running trial 2 of 5
  |==========================================================================================================| 100%
  Finished in 1.68 mins
  Running trial 3 of 5
  |==========================================================================================================| 100%
  Finished in 1.67 mins
  Running trial 4 of 5
  |==========================================================================================================| 100%
  Finished in 1.62 mins
  Running trial 5 of 5
  |==========================================================================================================| 100%
  Finished in 1.45 mins
- DECOMP.RSSD NOT converged: sd@qt.20 0.068 >= 0.059 & med@qt.20 0.15 < 0.22 med@qt.1-3*sd
- NRMSE converged: sd@qt.20 0.0047 < 0.046 & med@qt.20 0.059 < 0.089 med@qt.1-3*sd
Total run time: 8.16 mins
> OutputCollect <- robyn_outputs(
+   InputCollect, OutputModels
+   , pareto_fronts = 3
+   , csv_out = "all" # "pareto" or "all"
+   , clusters = TRUE # Set to TRUE to cluster similar models by ROAS. See ?robyn_clusters
+   , plot_pareto = FALSE # Set to FALSE to deactivate plotting and saving model one-pagers
+   , plot_folder = robyn_object # path for plots export
+ )
Using robyn object location: /Users/bernardolares/Desktop
>>> Running Pareto calculations for 10000 models on 3 fronts...
>>> Collecting 154 pareto-optimum results into: /Users/bernardolares/Desktop/2022-02-28 14.04 init/
>> Exporting all results as CSVs into directory...
>> Exporting general plots into directory...
>>> Calculating clusters for model selection using Pareto fronts...
>> Auto selected k = 6 (clusters) based on minimum WSS variance of 5%

What hyperparameters are you using? Are you literally running the demo or did you adapt those hyperparameters to be used with your own data?

Please, update to the latest version, and DO make sure you're using the latest version in a fresh new session.

tgtod002 commented 2 years ago

Hi Lares,

Just making sure....are you using my sample file? I only have 105 weeks of data.

laresbernardo commented 2 years ago

Yes. But you have to adapt the hyper parameters to be useful with your data. I’d need that to replicate.
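
As a rough illustration of what adapting the hyperparameters involves, the sketch below checks a hyperparameter list against a set of media variables. It assumes the `<media>_alphas`, `<media>_gammas`, `<media>_thetas` naming used with geometric adstock; `expected_hyper_names()` is a hypothetical helper and the media names are made up. Robyn's own hyper_names() remains the authoritative source for the names it expects.

```r
# Hypothetical helper (not a Robyn function): build the hyperparameter names a
# geometric-adstock setup would need, assuming "<media>_<suffix>" naming.
expected_hyper_names <- function(media) {
  suffixes <- c("alphas", "gammas", "thetas")
  # outer() gives a media x suffix matrix; transposing before flattening
  # keeps the three names of each media variable together.
  as.vector(t(outer(media, suffixes, paste, sep = "_")))
}

# Made-up media variables and ranges, for illustration only.
media <- c("tv_spend", "search_spend")
hyperparameters <- list(
  tv_spend_alphas = c(0.5, 3),
  tv_spend_gammas = c(0.3, 1),
  tv_spend_thetas = c(0, 0.3),
  search_spend_alphas = c(0.5, 3),
  search_spend_gammas = c(0.3, 1)
  # search_spend_thetas is deliberately left out to show the check firing
)

missing_names <- setdiff(expected_hyper_names(media), names(hyperparameters))
unexpected_names <- setdiff(names(hyperparameters), expected_hyper_names(media))
print(missing_names)
print(unexpected_names)
```

Both vectors being empty would indicate that the list covers exactly the expected names under this assumed naming scheme.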


tgtod002 commented 2 years ago

InputCollect <- robyn_inputs(
  dt_input = afw_data1
  ,dt_holidays = dt_prophet_holidays

  ### set variables

  ,date_var = "new_date"                               # date format must be "2020-01-01"
  ,dep_var = "SALES.AMT"                              # there should be only one dependent variable
  ,dep_var_type = "revenue"                         # "revenue" or "conversion"

  ,prophet_vars = c("trend", "season", "holiday") #               
  ,prophet_signs = c("default", "default", "default")       # c("default", "positive", and "negative").
  ,prophet_country = "US"                           # only one country allowed once. Including national s

  ,paid_media_vars = c("CrossDeviceDisplayimpressions",  "Cross_device_impressions", "DirectMailMediaimpressions", "DynamicMobileimpressions"   )
  ,paid_media_signs = c("positive", "positive", "positive", "positive")            #   
  ,paid_media_spends = c("CrossDevideDisplay_media_cost",  "Cross_dev_video_cost", "DirectMailInsert_media_cost", "DynamicMobilemedia_cost"     )         
  ### set model parameters

  ## set cores for parallel computing
  ,cores = 6 

  ## set rolling window start
  ,window_start = "2020-01-04"
  ,window_end = "2021-12-27"

  ## set model core features
  ,adstock = "geometric"            
  ,iterations = 2000  
  ,intercept_sign = "non_negative" # intercept_sign input must be any of: non_negative, unconstrained
  ,nevergrad_algo = "TwoPointsDE" 
  ,trials = 5 # number of allowed trials. 5 is recommended without calibration,

)

## 3. Hyperparameter interpretation & recommendation:

hyper_names(adstock = InputCollect$adstock, all_media = InputCollect$all_media)

hyperparameters <- list(

  Cross_dev_video_cost_alphas = c(0.5, 3)      
  ,Cross_dev_video_cost_gammas = c(0.3, 1)      
  ,Cross_dev_video_cost_thetas =  c(0, 0.3)

  ,CrossDevideDisplay_media_cost_alphas = c(0.5, 3) 
  ,CrossDevideDisplay_media_cost_gammas = c(0.3, 1)  
  ,CrossDevideDisplay_media_cost_thetas =  c(0, 0.3)

  ,DirectMailInsert_media_cost_alphas = c(0.5, 3)
  ,DirectMailInsert_media_cost_gammas = c(0.3, 1)
  ,DirectMailInsert_media_cost_thetas = c(0.1, 0.4)

  ,DynamicMobilemedia_cost_alphas = c(0.5, 3) 
  ,DynamicMobilemedia_cost_gammas =  c(0.3, 1)  
  , DynamicMobilemedia_cost_thetas   = c(0, 0.3)  
 )

#### 2a-3: Third, add hyperparameters into robyn_inputs()

InputCollect <- robyn_inputs(InputCollect = InputCollect, hyperparameters = hyperparameters)

laresbernardo commented 2 years ago

Ok, thanks for sharing. A couple of comments:

tgtod002 commented 2 years ago

Thanks Lares. What I am trying to do is run the program with some data as a test, so my data is still WIP. Given that, I noticed that I get an error running robyn_outputs() when I set outputs = TRUE in robyn_run(). The following is the error:

Using robyn object location: C:/Users todor/OneDrive - Vericast/Documents/AFW_MMM_Model
Provided 'plot_folder' doesn't exist. Using default 'plot_folder = getwd()': C:/Users/ttodor/OneDrive - Vericast/Documents/AFW_MMM_Model

Running Pareto calculations for 10000 models on 3 fronts...
Error in eval(jsub, SDenv, parent.frame()) :
  object 'iterNG' not found

If I set outputs = FALSE in robyn_run, robyn_outputs works.

I want to be able to look at the one-pagers, which are created when you set outputs = TRUE.
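
On the "Provided 'plot_folder' doesn't exist" notice in the log above: a minimal base-R sketch of making sure an export folder exists before passing it on. `ensure_plot_folder()` is a hypothetical helper, and the temporary directory only stands in for a real project path; whether this changes the behaviour of robyn_outputs() in this case is an assumption, not something confirmed in this thread.

```r
# Hypothetical helper: create the folder if it is missing and return its
# normalized path.
ensure_plot_folder <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  normalizePath(path, winslash = "/")
}

# A temporary directory stands in for a real project folder here.
plot_folder <- ensure_plot_folder(file.path(tempdir(), "robyn_plots"))
print(dir.exists(plot_folder))
```

The returned path could then be supplied as the `plot_folder` argument shown in the robyn_outputs() call earlier in the thread.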

laresbernardo commented 2 years ago

Yeah, makes sense.

laresbernardo commented 2 years ago

Feel free to re-open if this particular error or issue persists when updating to 3.6.2 @tgtod002
