facebookincubator / GeoLift

GeoLift is an end-to-end geo-experimental methodology based on Synthetic Control Methods used to measure the true incremental effect (Lift) of ad campaign.
https://facebookincubator.github.io/GeoLift/
MIT License
171 stars 53 forks source link

Subsetting error when no markets meet criteria #191

Open karawoo opened 2 months ago

karawoo commented 2 months ago

Bug description

If no suitable markets are found at line 1993, the function will print the message but then throw a separate error at line 1998:

https://github.com/facebookincubator/GeoLift/blob/9722f4f295205c0e5a1c7e434cb5ab972cf23675/R/pre_test_power.R#L1993-L1998

Session information

R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.4.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_1.1.4   GeoLift_2.7.4

loaded via a namespace (and not attached):
 [1] gtable_0.3.5           xfun_0.44              ggplot2_3.5.1          processx_3.8.4         lattice_0.22-6        
 [6] callr_3.7.6            ps_1.7.6               quadprog_1.5-8         vctrs_0.6.5.9000       tools_4.4.0           
[11] generics_0.1.3         augsynth_0.2.0         parallel_4.4.0         CausalImpact_1.3.0     sandwich_3.1-0        
[16] tibble_3.2.1           proxy_0.4-27           fansi_1.0.6            R.oo_1.26.0            xts_0.13.2            
[21] pkgconfig_2.0.3        MarketMatching_1.2.1   Matrix_1.7-0           rngtools_1.5.2         assertthat_0.2.1      
[26] R.cache_0.16.0         lifecycle_1.0.4        compiler_4.4.0         stringr_1.5.1          progress_1.2.3        
[31] munsell_0.5.1          codetools_0.2-20       gsynth_1.2.1           htmltools_0.5.8.1      yaml_2.3.8            
[36] Formula_1.2-5          pillar_1.9.0           crayon_1.5.2           tidyr_1.3.1            R.utils_2.12.3        
[41] panelView_1.1.17       MASS_7.3-60.2          doRNG_1.8.6            iterators_1.0.14       abind_1.4-5           
[46] foreach_1.5.2          parallelly_1.37.1      styler_1.10.3          tidyselect_1.2.1       digest_0.6.35         
[51] mvtnorm_1.2-5          stringi_1.8.4          future_1.33.2          reshape2_1.4.4         purrr_1.0.2           
[56] listenv_0.9.1          fastmap_1.2.0          grid_4.4.0             colorspace_2.1-0       cli_3.6.2             
[61] lfe_3.0-0              magrittr_2.0.3         utf8_1.2.4             clipr_0.8.0            withr_3.0.0           
[66] prettyunits_1.2.0      scales_1.3.0           rmarkdown_2.27         globals_0.16.3         gridExtra_2.3         
[71] Boom_0.9.15            R.methodsS3_1.8.2      zoo_1.8-12             hms_1.1.3              evaluate_0.23         
[76] knitr_1.47             BoomSpikeSlab_1.2.6    dtw_1.23-1             doParallel_1.0.17      rlang_1.1.3           
[81] Rcpp_1.0.12            xtable_1.8-4           glue_1.7.0             directlabels_2024.1.21 reprex_2.1.0          
[86] rstudioapi_0.16.0      R6_2.5.1               plyr_1.8.9             fs_1.6.4               bsts_0.9.10           

Reproduction steps

library("GeoLift")

GeoTestData_PreTest <- GeoDataRead(
  data = GeoLift_PreTest,
  date_id = "date",
  location_id = "location",
  Y_id = "Y",
  X = c(),
  format = "yyyy-mm-dd",
  summary = FALSE
)

MarketSelections <- GeoLiftMarketSelection(
  data = GeoTestData_PreTest,
  treatment_periods = 15,
  N = 2,
  Y_id = "Y",
  location_id = "location",
  time_id = "time",
  effect_size = seq(0, 0.5, 0.05),
  lookback_window = 1,
  include_markets = c("chicago", "portland"),
  cpic = 7.5,
  budget = NA # this is the problematic input
)
#> Setting up cluster.
#> Importing functions into cluster.
#> Attempting to load the environment 'package:dplyr'
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> Calculating which the best treatment groups are.
#> 
#> Deterministic setup with 2 locations in treatment.
#> 
#> Warning: No markets meet the criteria you provided. Consider modifying
#>           the input parameters
#> Error in `[.data.frame`(resultsM, , c(13, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, : undefined columns selected

Created on 2024-06-07 with reprex v2.1.0

Expected behavior

In other places where no suitable markets are found, the function errors right away. I'd expect the same here.

https://github.com/facebookincubator/GeoLift/blob/9722f4f295205c0e5a1c7e434cb5ab972cf23675/R/pre_test_power.R#L1812 https://github.com/facebookincubator/GeoLift/blob/9722f4f295205c0e5a1c7e434cb5ab972cf23675/R/pre_test_power.R#L1896

Would you be open to me submitting a PR to make line 1993 consistent with these two lines ^ ?

raphaeltamaki commented 2 months ago

Hello @karawoo Thank you for submitting this error. I will revise it during the week but if you wish to open a PR I will be more than happy to review it