facebookexperimental / Robyn

Robyn is an experimental, AI/ML-powered and open sourced Marketing Mix Modeling (MMM) package from Meta Marketing Science. Our mission is to democratise modeling knowledge, inspire the industry through innovation, reduce human bias in the modeling process & build a strong open source marketing science community.
https://facebookexperimental.github.io/Robyn/
MIT License

simpleError in names(hyper_list_all) #952

Closed AdQ-MiguelMolina closed 2 months ago

AdQ-MiguelMolina commented 2 months ago

Project Robyn

Describe issue

Hello Robyn developers.

I am currently facing a problem that other developers have run into before, but they were using R. From what I read, it was related to how the input was written: one person fixed it by updating RStudio and another by refactoring their input to look more like the demo. I am using Python and I can't find where the problem is.

One other thing I notice is that the final error description in the log is in Spanish (my PC is set to that language, although I don't understand why that should matter); it translates to "the 'names' attribute [16] must be the same length as the vector [15]".

Provide reproducible example

I attach the Robyn log file, as well as my inputs and the function I use to run all the Robyn steps (they are the same as in the demo, but encapsulated in a function, so I define everything beforehand and run it all together).

Environment & Robyn version

I am using the latest Robyn version, but not the latest R version.

I updated Robyn today (the day I am posting this, 8/4/2024) and I am using R 4.3.2. This shouldn't be the problem, since the demo runs fine.

Error and code

Robyn Log

Install nevergrad to proceed
Running plumber API at http://127.0.0.1:9999
Running swagger Docs at http://127.0.0.1:9999/__docs__/
Input 'window_start' is smaller than the earliest date in input data. It's automatically set to the earliest date: 2023-07-01
Input 'window_end' is larger than the latest date in input data. It's automatically set to the latest date: 2024-04-01
>> Running feature engineering...
Warning in .font_global(font, quiet = FALSE) :
  Font 'Arial Narrow' is not installed, has other name, or can't be found
NOTE: potential improvement on splitting channels for better exposure fitting. Threshold (Minimum R2) = 0.8 
  Check: InputCollect$modNLS$plots outputs
  Weak relationship for: "impressions_DISC_sum" and their spend
>>> One Hot Encoding applied to 4 variables: "campaign_code_combination", "strategy_code_combination", "market_code_combination", "product_code_combination"
Input data has 276 days in total: 2023-07-01 to 2024-04-01
Initial model is built on rolling window of 276 day: 2023-07-01 to 2024-04-01
Time-series validation with train_size range of 50%-80% of the data...
<simpleError in names(hyper_list_all) <- `*vtmp*`: el atributo 'names' [16] debe tener la misma longitud que el vector [15]>

My input:

modelParams = {
    "date_var": "date", # date format must be "2020-01-01"
    "dep_var": "revenue_sum", # there should be only one dependent variable
    "dep_var_type": "revenue", # "revenue" (ROI) or "conversion" (CPA)
    "prophet_vars": ["trend", "season", "holiday"], # "trend","season", "weekday" & "holiday"
    "prophet_country": "DE", # input country code. Check: dt_prophet_holidays
    "context_vars" : ["campaign_code_combination", "strategy_code_combination","market_code_combination","product_code_combination"], # e.g. competitors, discount, unemployment etc
    "paid_media_spends": ['inversions_SEM_sum','inversions_PRG_sum','inversions_DISC_sum','inversions_FBK_sum','inversions_YTB_sum'], # mandatory input
    "paid_media_vars": ['impressions_SEM_sum','impressions_PRG_sum','impressions_DISC_sum','impressions_FBK_sum','impressions_YTB_sum'], # mandatory.
    # paid_media_vars must have same order as paid_media_spends. Use media exposure metrics like
    # impressions, GRP etc. If not applicable, use spend instead.
    #"organic_vars" : "newsletter", # marketing activity without media spend
    "factor_vars" : ["campaign_code_combination", "strategy_code_combination","market_code_combination","product_code_combination"], # force variables in context_vars or organic_vars to be categorical
    "window_start": "2023-01-01", #Seleccionamos 2016 porque 2015 tenía muy pocos valores
    "window_end": "2024-12-31", #Podemos o no incluir 2019, 3 años deberían ser suficientes
    "adstock": "weibull_pdf" # geometric, weibull_cdf or weibull_pdf.
}
hyperParamsRanges = {
    "hyperparameters" : {
        "inversions_SEM_sum_alphas" : [0.15, 3.5],
        "inversions_SEM_sum_gammas" : [0.15, 0.9],
        "inversions_SEM_sum_shapes" : [0,15, 3.5],
        "inversions_SEM_sum_scales" : [0.15, 0.9],
        #
        "inversions_PRG_sum_alphas" : [0.15, 3.5],
        "inversions_PRG_sum_gammas" : [0.15, 0.9],
        "inversions_PRG_sum_shapes" : [0.15, 3.5],
        "inversions_PRG_sum_scales" : [0.15, 0.9],
        #
        "inversions_DISC_sum_alphas" : [0.15, 3.5],
        "inversions_DISC_sum_gammas" : [0.15, 0.9],
        "inversions_DISC_sum_shapes" : [0.15, 3.5],
        "inversions_DISC_sum_scales" : [0.15, 0.9],
        #
        "inversions_FBK_sum_alphas" : [0.15, 3.5],
        "inversions_FBK_sum_gammas" : [0.15, 0.9],
        "inversions_FBK_sum_shapes" : [0.15, 3.5],
        "inversions_FBK_sum_scales" : [0.15, 0.9],
        #
        "inversions_YTB_sum_alphas" : [0.15, 3.5],
        "inversions_YTB_sum_gammas" : [0.15, 0.9],
        "inversions_YTB_sum_shapes" : [0.15, 3.5],
        "inversions_YTB_sum_scales" : [0.15, 0.9],
        #
        "train_size": [0.5, 0.8]
        }
    }
iterationParams = {
    "iterations" : 3000, # NULL defaults to (max available - 1)
    "trials" : 5, # 5 recommended for the dummy dataset
    "ts_validation" : True,  # 3-way-split time series for NRMSE validation.
    "add_penalty_factor" : False, # Experimental feature. Use with caution.
}
paretoParams = {
    "pareto_fronts" : 'auto', # automatically pick how many pareto-fronts to fill min_candidates (100)
#     "min_candidates" : 100, # top pareto models for clustering. Default to 100
#     "calibration_constraint" : 0.1, # range [0.01, 0.1] & default at 0.1
    "csv_out" : "all", # "pareto", "all", or NULL (for none)
    "clusters" : True, # Set to TRUE to cluster similar models by ROAS.
    "export" : True, # this will create files locally
    "plot_folder" : os.getcwd(), # path for plots exports and files creation
    "plot_pareto" : True # Set to FALSE to deactivate plotting and saving model one-pagers
}

preSelectResults = allStepsPreSelectTogether(
    SPG[
    ["campaign_code_combination", "strategy_code_combination","market_code_combination","product_code_combination"]
    +['inversions_SEM_sum','inversions_PRG_sum','inversions_DISC_sum','inversions_FBK_sum','inversions_YTB_sum']
    +['impressions_SEM_sum','impressions_PRG_sum','impressions_DISC_sum','impressions_FBK_sum','impressions_YTB_sum']
    +['date','revenue_sum']
    ],
    pHelper.pandas_builder(pHelper.robyn_api('dt_prophet_holidays')),
    modelParams,
    hyperParamsRanges,
    iterationParams,
    paretoParams
)

I get the error right after the spend-exposure fitting step.
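In case it helps narrow things down, this is a minimal sketch of the kind of consistency check I mean, run against the dicts above. It assumes Robyn's usual hyperparameter naming (<spend>_alphas and _gammas for every paid channel, plus _shapes/_scales for the weibull adstocks or _thetas for geometric, and train_size only when ts_validation is on); the helper expected_hyper_names is just mine, for illustration.

def expected_hyper_names(paid_media_spends, adstock, ts_validation):
    # Suffixes per Robyn's usual naming convention (assumption, see note above)
    suffixes = ["_alphas", "_gammas"]
    if adstock in ("weibull_cdf", "weibull_pdf"):
        suffixes += ["_shapes", "_scales"]
    else:  # geometric
        suffixes += ["_thetas"]
    names = [spend + s for spend in paid_media_spends for s in suffixes]
    if ts_validation:
        names.append("train_size")
    return set(names)

expected = expected_hyper_names(
    modelParams["paid_media_spends"],
    modelParams["adstock"],
    iterationParams["ts_validation"],
)
provided = set(hyperParamsRanges["hyperparameters"].keys())
print("missing:", sorted(expected - provided))
print("extra:  ", sorted(provided - expected))
# Robyn accepts a single fixed value or a [lower, upper] range per hyperparameter (assumption)
print("ranges with unexpected length:",
      [k for k, v in hyperParamsRanges["hyperparameters"].items() if len(v) not in (1, 2)])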

Thanks in advance.

yu-ya-tanaka commented 2 months ago

Hi, this looks similar to the following issue: https://github.com/facebookexperimental/Robyn/issues/705

Which versions of Robyn and R are you using? Do you have the same issue when using geometric adstock or ts_validation = False?

AdQ-MiguelMolina commented 2 months ago

Hello and thanks for replying.

Yes, I saw that issue before. They mentioned solving it by updating their RStudio version, but I am working in a Jupyter Notebook with the API setup from the demo.

If I use geometric adstock I don't get any problem, but with weibull_pdf adstock I keep getting the same error, even with ts_validation=False.

I don't know what the problem could be: I don't use RStudio, and the Python input looks well formed; if there were a problem with the input text, I would expect a different error saying it is malformed.

Thanks again for the help.

gufengzhou commented 2 months ago

When you use ts_validation=False, you shouldn't add "train_size": [0.5, 0.8] to hyperParamsRanges anymore. Does it work if you remove it?
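For example, something along these lines, using the dict names from your snippet (just a sketch of one way to keep the two settings consistent):

# Keep ts_validation and the "train_size" hyperparameter consistent
iterationParams["ts_validation"] = False

if not iterationParams["ts_validation"]:
    # train_size only applies when time-series validation splits the data
    hyperParamsRanges["hyperparameters"].pop("train_size", None)
elif "train_size" not in hyperParamsRanges["hyperparameters"]:
    hyperParamsRanges["hyperparameters"]["train_size"] = [0.5, 0.8]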

AdQ-MiguelMolina commented 2 months ago

I had tried that back then, but I must have done something wrong, because this time it worked fine. Thanks for the help.