facebookexperimental / Robyn

Robyn is an experimental, AI/ML-powered and open sourced Marketing Mix Modeling (MMM) package from Meta Marketing Science. Our mission is to democratise modeling knowledge, inspire the industry through innovation, reduce human bias in the modeling process & build a strong open source marketing science community.
https://facebookexperimental.github.io/Robyn/
MIT License

simpleError in names(hyper_list_all) #952

Closed AdQ-MiguelMolina closed 2 months ago

AdQ-MiguelMolina commented 2 months ago

Project Robyn

Describe issue

Hello Robyn developers.

I am currently facing a problem that other developers have run into before, but they were using R. From what I read, it was related to how the input was written: one person fixed it by updating RStudio and another by refactoring their input to look more like the demo. I am using Python and I can't find where the problem is.

One other thing I notice is that the final error description in the log is in Spanish (my PC is set to that language, although I don't understand why that should matter); it translates to "the 'names' attribute [16] must be the same length as the vector [15]".

Provide reproducible example

I attach the Robyn log file, as well as my inputs and the function I use to run all the Robyn steps (they are the same as in the demo, but encapsulated in a function, so I define everything beforehand and run it all together).

Environment & Robyn version

I am using the latest Robyn version, but not the latest R version.

I updated Robyn today (the day I am posting this, 8/4/2024) and I am using R 4.3.2. This shouldn't be the problem, since the demo runs fine.

Error and code

Robyn Log

Install nevergrad to proceed
Running plumber API at http://127.0.0.1:9999
Running swagger Docs at http://127.0.0.1:9999/__docs__/
Input 'window_start' is smaller than the earliest date in input data. It's automatically set to the earliest date: 2023-07-01
Input 'window_end' is larger than the latest date in input data. It's automatically set to the latest date: 2024-04-01
>> Running feature engineering...
Warning in .font_global(font, quiet = FALSE) :
  Font 'Arial Narrow' is not installed, has other name, or can't be found
NOTE: potential improvement on splitting channels for better exposure fitting. Threshold (Minimum R2) = 0.8 
  Check: InputCollect$modNLS$plots outputs
  Weak relationship for: "impressions_DISC_sum" and their spend
>>> One Hot Encoding applied to 4 variables: "campaign_code_combination", "strategy_code_combination", "market_code_combination", "product_code_combination"
Input data has 276 days in total: 2023-07-01 to 2024-04-01
Initial model is built on rolling window of 276 day: 2023-07-01 to 2024-04-01
Time-series validation with train_size range of 50%-80% of the data...
<simpleError in names(hyper_list_all) <- `*vtmp*`: el atributo 'names' [16] debe tener la misma longitud que el vector [15]>

My input:

modelParams = {
    "date_var": "date", # date format must be "2020-01-01"
    "dep_var": "revenue_sum", # there should be only one dependent variable
    "dep_var_type": "revenue", # "revenue" (ROI) or "conversion" (CPA)
    "prophet_vars": ["trend", "season", "holiday"], # "trend","season", "weekday" & "holiday"
    "prophet_country": "DE", # input country code. Check: dt_prophet_holidays
    "context_vars" : ["campaign_code_combination", "strategy_code_combination","market_code_combination","product_code_combination"], # e.g. competitors, discount, unemployment etc
    "paid_media_spends": ['inversions_SEM_sum','inversions_PRG_sum','inversions_DISC_sum','inversions_FBK_sum','inversions_YTB_sum'], # mandatory input
    "paid_media_vars": ['impressions_SEM_sum','impressions_PRG_sum','impressions_DISC_sum','impressions_FBK_sum','impressions_YTB_sum'], # mandatory.
    # paid_media_vars must have same order as paid_media_spends. Use media exposure metrics like
    # impressions, GRP etc. If not applicable, use spend instead.
    #"organic_vars" : "newsletter", # marketing activity without media spend
    "factor_vars" : ["campaign_code_combination", "strategy_code_combination","market_code_combination","product_code_combination"], # force variables in context_vars or organic_vars to be categorical
    "window_start": "2023-01-01", #Seleccionamos 2016 porque 2015 tenía muy pocos valores
    "window_end": "2024-12-31", #Podemos o no incluir 2019, 3 años deberían ser suficientes
    "adstock": "weibull_pdf" # geometric, weibull_cdf or weibull_pdf.
}
hyperParamsRanges = {
    "hyperparameters" : {
        "inversions_SEM_sum_alphas" : [0.15, 3.5],
        "inversions_SEM_sum_gammas" : [0.15, 0.9],
        "inversions_SEM_sum_shapes" : [0,15, 3.5],
        "inversions_SEM_sum_scales" : [0.15, 0.9],
        #
        "inversions_PRG_sum_alphas" : [0.15, 3.5],
        "inversions_PRG_sum_gammas" : [0.15, 0.9],
        "inversions_PRG_sum_shapes" : [0.15, 3.5],
        "inversions_PRG_sum_scales" : [0.15, 0.9],
        #
        "inversions_DISC_sum_alphas" : [0.15, 3.5],
        "inversions_DISC_sum_gammas" : [0.15, 0.9],
        "inversions_DISC_sum_shapes" : [0.15, 3.5],
        "inversions_DISC_sum_scales" : [0.15, 0.9],
        #
        "inversions_FBK_sum_alphas" : [0.15, 3.5],
        "inversions_FBK_sum_gammas" : [0.15, 0.9],
        "inversions_FBK_sum_shapes" : [0.15, 3.5],
        "inversions_FBK_sum_scales" : [0.15, 0.9],
        #
        "inversions_YTB_sum_alphas" : [0.15, 3.5],
        "inversions_YTB_sum_gammas" : [0.15, 0.9],
        "inversions_YTB_sum_shapes" : [0.15, 3.5],
        "inversions_YTB_sum_scales" : [0.15, 0.9],
        #
        "train_size": [0.5, 0.8]
        }
    }
iterationParams = {
    "iterations" : 3000, # NULL defaults to (max available - 1)
    "trials" : 5, # 5 recommended for the dummy dataset
    "ts_validation" : True,  # 3-way-split time series for NRMSE validation.
    "add_penalty_factor" : False, # Experimental feature. Use with caution.
}
paretoParams = {
    "pareto_fronts" : 'auto', # automatically pick how many pareto-fronts to fill min_candidates (100)
#     "min_candidates" : 100, # top pareto models for clustering. Default to 100
#     "calibration_constraint" : 0.1, # range [0.01, 0.1] & default at 0.1
    "csv_out" : "all", # "pareto", "all", or NULL (for none)
    "clusters" : True, # Set to TRUE to cluster similar models by ROAS.
    "export" : True, # this will create files locally
    "plot_folder" : os.getcwd(), # path for plots exports and files creation
    "plot_pareto" : True # Set to FALSE to deactivate plotting and saving model one-pagers
}

preSelectResults = allStepsPreSelectTogether(
    SPG[
    ["campaign_code_combination", "strategy_code_combination","market_code_combination","product_code_combination"]
    +['inversions_SEM_sum','inversions_PRG_sum','inversions_DISC_sum','inversions_FBK_sum','inversions_YTB_sum']
    +['impressions_SEM_sum','impressions_PRG_sum','impressions_DISC_sum','impressions_FBK_sum','impressions_YTB_sum']
    +['date','revenue_sum']
    ],
    pHelper.pandas_builder(pHelper.robyn_api('dt_prophet_holidays')),
    modelParams,
    hyperParamsRanges,
    iterationParams,
    paretoParams
)

I get the error right after the spend-exposure fitting step.
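In case it helps narrow things down, this is a minimal sketch of the kind of consistency check I mean, run against the dicts above. It assumes Robyn's usual hyperparameter naming (<spend>_alphas and _gammas for every paid channel, plus _shapes/_scales for the weibull adstocks or _thetas for geometric, and train_size only when ts_validation is on); the helper expected_hyper_names is just mine, for illustration.

def expected_hyper_names(paid_media_spends, adstock, ts_validation):
    # Suffixes per Robyn's usual naming convention (assumption, see note above)
    suffixes = ["_alphas", "_gammas"]
    if adstock in ("weibull_cdf", "weibull_pdf"):
        suffixes += ["_shapes", "_scales"]
    else:  # geometric
        suffixes += ["_thetas"]
    names = [spend + s for spend in paid_media_spends for s in suffixes]
    if ts_validation:
        names.append("train_size")
    return set(names)

expected = expected_hyper_names(
    modelParams["paid_media_spends"],
    modelParams["adstock"],
    iterationParams["ts_validation"],
)
provided = set(hyperParamsRanges["hyperparameters"].keys())
print("missing:", sorted(expected - provided))
print("extra:  ", sorted(provided - expected))
# Robyn accepts a single fixed value or a [lower, upper] range per hyperparameter (assumption)
print("ranges with unexpected length:",
      [k for k, v in hyperParamsRanges["hyperparameters"].items() if len(v) not in (1, 2)])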

Thanks in advance.

yu-ya-tanaka commented 2 months ago

Hi, this looks similar to the following issue: https://github.com/facebookexperimental/Robyn/issues/705

Which versions of Robyn and R are you using? Do you have the same issue when using geometric adstock or ts_validation = False?

AdQ-MiguelMolina commented 2 months ago

Hello and thanks for replying.

Yes, I saw that issue before. They mentioned solving it by updating their RStudio version, but I am working in a Jupyter Notebook with the API setup from the demo.

If I use geometric adstock I don't get any problem, but with weibull_pdf adstock I keep getting the same error, even with ts_validation=False.

I don't know what the problem could be: I don't use RStudio, and the Python input looks well formed; if there were a problem with the input text, I would expect a different error saying it is malformed.

Thanks again for the help.

gufengzhou commented 2 months ago

When you use ts_validation=False, you shouldn't add "train_size": [0.5, 0.8] to hyperParamsRanges anymore. Does it work if you remove it?
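For example, something along these lines, using the dict names from your snippet (just a sketch of one way to keep the two settings consistent):

# Keep ts_validation and the "train_size" hyperparameter consistent
iterationParams["ts_validation"] = False

if not iterationParams["ts_validation"]:
    # train_size only applies when time-series validation splits the data
    hyperParamsRanges["hyperparameters"].pop("train_size", None)
elif "train_size" not in hyperParamsRanges["hyperparameters"]:
    hyperParamsRanges["hyperparameters"]["train_size"] = [0.5, 0.8]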

AdQ-MiguelMolina commented 2 months ago

I had tried that back then, but I must have done something wrong, because this time it worked fine. Thanks for the help.