allisonwiener commented 1 year ago

Project Robyn

Describe issue

When I run the demo.R example with the test data, the model code works about 5% of the time and fails with an error the other 95%. It looks like some sort of non-numeric issue with the test dataset, but I'm unable to track down exactly what is going on. Any tips? I'm surprised to be the only one with this problem. The error occurs when running Step 3 (Build initial model) with the standard simulated test dataset.

Error in signif(nevergrad_hp_val[[co]][index], 6): non-numeric argument to mathematical function.

Environment & Robyn version

I run everything from the demo.R file up until this point using Dev Robyn 3.10.4.9000, tidyverse 1.3.2, reticulate ver 1.29, cli 3.6.1, R version: 4.1.3

Reproducible example code:

LOADING

options(timeout = 400)
install.packages("remotes")
remotes::install_github("facebookexperimental/Robyn")
library(Robyn)
library(tidyverse)
packageVersion("tidyverse")
packageVersion("reticulate") # 1.28
create_files <- TRUE

INSTALL NEVERGRAD

library("reticulate")
install_miniconda()
conda_create("r-reticulate")
use_condaenv("r-reticulate")
Sys.setenv(RETICULATE_PYTHON = "~/Library/r-miniconda-arm64/envs/r-reticulate/bin/python")
py_config() # If the first path is not as 5, do 7
conda_install("r-reticulate", "numpy", pip = TRUE)
conda_install("r-reticulate", "nevergrad", pip = TRUE)
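Since the failure appears right as the nevergrad trials start, it may be worth confirming that the R session can actually import nevergrad from the Python environment configured above. The following is only a sketch: `nevergrad_status()` is a hypothetical helper written for this thread, not part of Robyn or reticulate, and it relies only on `reticulate::py_module_available()`.

```r
# Hypothetical helper, not part of Robyn or reticulate.
# Returns a one-line status string saying whether the Python module
# "nevergrad" can be imported through reticulate.
nevergrad_status <- function() {
  # system.file() returns "" when the package is not installed
  if (!nzchar(system.file(package = "reticulate"))) {
    return("reticulate is not installed")
  }
  if (reticulate::py_module_available("nevergrad")) {
    "nevergrad is importable"
  } else {
    "nevergrad is NOT importable from the active Python"
  }
}

# Initialising Python can take a moment, so only run the check interactively.
if (interactive()) {
  message(nevergrad_status())
}
```

If the status reports that nevergrad is not importable, the Python path shown by `py_config()` is a reasonable first thing to compare against the `RETICULATE_PYTHON` setting.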

Step 1: Load data

data("dt_simulated_weekly")
data("dt_prophet_holidays")
robyn_directory <- "~/Documents/ROutput"

Step 2a: For first time user: Model specification in 4 steps

InputCollect <- robyn_inputs(
  dt_input = dt_simulated_weekly,
  dt_holidays = dt_prophet_holidays,
  date_var = "DATE", # date format must be "2020-01-01"
  dep_var = "revenue", # there should be only one dependent variable
  dep_var_type = "revenue", # "revenue" (ROI) or "conversion" (CPA)
  prophet_vars = c("trend", "season", "holiday"), # "trend","season", "weekday" & "holiday"
  prophet_country = "DE", # input one country. dt_prophet_holidays includes 59 countries by default
  context_vars = c("competitor_sales_B", "events"), # e.g. competitors, discount, unemployment etc
  paid_media_spends = c("tv_S", "ooh_S", "print_S", "facebook_S", "search_S"), # mandatory input
  paid_media_vars = c("tv_S", "ooh_S", "print_S", "facebook_I", "search_clicks_P"), # mandatory.
  # paid_media_vars must have same order as paid_media_spends. Use media exposure metrics like
  # impressions, GRP etc. If not applicable, use spend instead.
  organic_vars = "newsletter", # marketing activity without media spend
  # factor_vars = c("events"), # force variables in context_vars or organic_vars to be categorical
  window_start = "2016-01-01",
  window_end = "2018-12-31",
  adstock = "geometric" # geometric, weibull_cdf or weibull_pdf.
)
print(InputCollect)

2a-2: Second, define and add hyperparameters

hyper_names(adstock = InputCollect$adstock, all_media = InputCollect$all_media)
plot_adstock(plot = FALSE)
plot_saturation(plot = FALSE)

4. Set individual hyperparameter bounds.

hyper_limits()
hyperparameters <- list(
  facebook_S_alphas = c(0.5, 3),
  facebook_S_gammas = c(0.3, 1),
  facebook_S_thetas = c(0, 0.3),
  print_S_alphas = c(0.5, 3),
  print_S_gammas = c(0.3, 1),
  print_S_thetas = c(0.1, 0.4),
  tv_S_alphas = c(0.5, 3),
  tv_S_gammas = c(0.3, 1),
  tv_S_thetas = c(0.3, 0.8),
  search_S_alphas = c(0.5, 3),
  search_S_gammas = c(0.3, 1),
  search_S_thetas = c(0, 0.3),
  ooh_S_alphas = c(0.5, 3),
  ooh_S_gammas = c(0.3, 1),
  ooh_S_thetas = c(0.1, 0.4),
  newsletter_alphas = c(0.5, 3),
  newsletter_gammas = c(0.3, 1),
  newsletter_thetas = c(0.1, 0.4),
  train_size = c(0.5, 0.8)
)

2a-3: Third, add hyperparameters into robyn_inputs()

InputCollect <- robyn_inputs(InputCollect = InputCollect, hyperparameters = hyperparameters)
if (length(InputCollect$exposure_vars) > 0) {
  lapply(InputCollect$modNLS$plots, plot)
}

Step 3: Build initial model

OutputModels <- robyn_run(
  InputCollect = InputCollect, # feed in all model specification
  cores = NULL, # NULL defaults to (max available - 1)
  iterations = 2000, # 2000 recommended for the dummy dataset with no calibration
  trials = 5, # 5 recommended for the dummy dataset
  ts_validation = TRUE, # 3-way-split time series for NRMSE validation.
  add_penalty_factor = FALSE # Experimental feature. Use with caution.
)
print(OutputModels)

###################################ERROR:

Input data has 208 weeks in total: 2014-11-23 to 2019-11-11
inital model is built on rolling window of 157 week: 2016-01-04 to 2018-12-31
Time-series validation with train_size range of 50-80% of the data...
Using geometric adstocking with 20 hyperparameters (20 to iterate + 0 fixed) on 31 cores
>>> Starting 5 trials with 2000 iterations each using TwoPointsDE nevergrad algorithm...
Running trial 1 of 5
|| 0%
Timing stopped at: 0.003 0 0.003

Error in signif(nevergrad_hp_val[[co]][index], 6): non-numeric argument to mathematical function.
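For anyone hitting the same message: `signif()` raises exactly this error whenever its first argument is not numeric, for example a character string or `NULL`. The sketch below reproduces that behaviour in base R and shows one way to inspect a value before it reaches `signif()`. `describe_for_signif()` is a hypothetical helper written for illustration, and nothing here claims to identify which of these cases (if any) Robyn actually runs into.

```r
# Hypothetical helper (not part of Robyn): summarise a value that is about to
# be passed to signif(), so the offending type is visible before the error.
describe_for_signif <- function(x) {
  data.frame(
    class = paste(class(x), collapse = "/"),
    length = length(x),
    is_numeric = is.numeric(x),
    stringsAsFactors = FALSE
  )
}

# signif() works on numeric input...
ok <- signif(0.123456789, 6)

# ...but errors on character or NULL input; capture the messages instead of stopping.
msg_chr <- tryCatch(signif("0.5", 6), error = function(e) conditionMessage(e))
msg_null <- tryCatch(signif(NULL, 6), error = function(e) conditionMessage(e))

print(describe_for_signif("0.5"))
print(describe_for_signif(NULL))
print(msg_chr)
```

Running the same kind of check on the value that reaches `signif()` inside a debugging session (for instance via `options(error = recover)`) would show whether it is a string, a list, or empty.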

gufengzhou commented 1 year ago

This is unusual. Have you also updated all dependencies?

brunofrancesco commented 1 year ago

I had the same error!

Input data has 126 weeks in total: 2021-01-04 to 2023-05-29
Initial model is built on rolling window of 126 week: 2021-01-04 to 2023-05-29
Time-series validation with train_size range of 50%-80% of the data...
Using geometric adstocking with 29 hyperparameters (29 to iterate + 0 fixed) on 1 core (Windows fallback)
Starting 5 trials with 2000 iterations each using TwoPointsDE nevergrad algorithm...
Running trial 1 of 5
| | 0%
Timing stopped at: 0 0 0.01
Error in signif(nevergrad_hp_val[[co]][index], 6) : non-numeric argument to mathematical function

allisonwiener commented 1 year ago

Yes, I updated all dependencies with "update.packages(checkBuilt=TRUE, ask=FALSE)". The model actually ran successfully the first time after the update, but it has not run since.
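To make it easier to compare environments across the people reporting this, a short version report can be pasted into the thread. This is a sketch only: `installed_versions()` is a hypothetical helper, and the package list is simply the one mentioned in the original report.

```r
# Hypothetical helper: return the installed version of each package as text,
# or NA when the package is not installed. Packages are not loaded.
installed_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (nzchar(system.file(package = p))) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

versions <- installed_versions(c("Robyn", "reticulate", "tidyverse", "cli"))
print(versions)
print(R.version.string)
```

An `NA` entry means the package is not installed in the library paths of the current session, which can itself be a hint when several R installations are present.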

gufengzhou commented 1 year ago

So you still have the error after the update?

Have you changed anything? I can't reproduce your error.