klarsen1 / MarketMatching

Number of controls is limited to five #15

Closed mc51 closed 5 years ago

mc51 commented 5 years ago

Edit: Inspecting the source code, I found the control_matches parameter of the inference() function. This resolves my issue. However, it is not documented anywhere, so I spent far too much time tracking it down :/

I've been trying to use more than 5 controls with MarketMatching. Setting matches = 8 in best_matches() correctly returns 8 controls for each test market. However, inference() always seems to use at most 5 controls. I have seen the same behavior with other datasets that have hundreds of potential controls. I also made sure that this is not an issue in the underlying CausalImpact package or in other required or implicitly loaded packages (BoomSpikeSlab, bsts). The following code reproduces the issue. Note that setting matches to a value below 5 has the expected effect:


data(weather, package="MarketMatching")
mm <- best_matches(data=weather,
                   id_variable="Area",
                   date_variable="Date",
                   matching_variable="Mean_TemperatureF",
                   parallel=FALSE,
                   warping_limit=1,
                   dtw_emphasis=1, # rely only on dtw for pre-screening
                   matches=8,      # values below 5 change what inference() uses, values above 5 do not
                   start_match_period="2014-01-01",
                   end_match_period="2014-10-01")

results <- MarketMatching::inference(matched_markets = mm,
                                    test_market = "CPH",
                                    end_post_period = "2015-10-01")

head(results$CausalImpactObject$model$bsts.model$coefficient)
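
For reference, here is a minimal sketch of the workaround mentioned in the edit at the top of this post: passing control_matches to inference() so that it matches the matches value given to best_matches(). The variable name n_controls is only illustrative, and the idea that inference() falls back to 5 controls when control_matches is omitted is inferred from the behavior described here, not from the documentation.

```r
# Sketch only: assumes MarketMatching (1.1.1 as in this report) is installed.
library(MarketMatching)

data(weather, package = "MarketMatching")

# Illustrative helper variable so both calls use the same number of controls.
n_controls <- 8

mm <- best_matches(data = weather,
                   id_variable = "Area",
                   date_variable = "Date",
                   matching_variable = "Mean_TemperatureF",
                   parallel = FALSE,
                   warping_limit = 1,
                   dtw_emphasis = 1,
                   matches = n_controls,
                   start_match_period = "2014-01-01",
                   end_match_period = "2014-10-01")

# control_matches is the parameter found in the source code of inference();
# setting it explicitly is what lifted the cap of 5 controls.
results <- MarketMatching::inference(matched_markets = mm,
                                     test_market = "CPH",
                                     end_post_period = "2015-10-01",
                                     control_matches = n_controls)

# Inspect which predictors ended up in the underlying bsts model.
coefs <- results$CausalImpactObject$model$bsts.model$coefficients
colnames(coefs)
```

Comparing the column names of coefs with and without control_matches should show whether the additional controls actually reach the model.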

I'm working with this environment:
> 
> R version 3.5.1 (2018-07-02)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 14393)
> 
> Matrix products: default
> 
> locale:
> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
> [5] LC_TIME=German_Germany.1252    
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> 
> other attached packages:
>  [1] MarketMatching_1.1.1 bindrcpp_0.2.2       forcats_0.3.0       
>  [4] stringr_1.3.1        dplyr_0.7.6          purrr_0.2.5         
>  [7] readr_1.1.1          tidyr_0.8.1          tibble_1.4.2        
> [10] ggplot2_3.0.0        tidyverse_1.2.1      CausalImpact_1.2.4
> [13] bsts_0.8.0           xts_0.11-1           zoo_1.8-4           
> [16] BoomSpikeSlab_1.0.0  Boom_0.8             MASS_7.3-50         
> [19] RevoUtils_11.0.1     RevoUtilsMath_11.0.0
> 
> loaded via a namespace (and not attached):
>  [1] Rcpp_0.12.18      lubridate_1.7.4   lattice_0.20-35   foreach_1.4.4    
>  [5] assertthat_0.2.0  digest_0.6.15     IRdisplay_0.5.0   R6_2.2.2         
>  [9] cellranger_1.1.0  plyr_1.8.4        repr_0.15.0       backports_1.1.2  
> [13] evaluate_0.11     httr_1.3.1        pillar_1.3.0      rlang_0.2.1      
> [17] lazyeval_0.2.1    uuid_0.1-2        readxl_1.1.0      data.table_1.12.0
> [21] rstudioapi_0.8    labeling_0.3      munsell_0.5.0     proxy_0.4-22     
> [25] broom_0.5.0       compiler_3.5.1    modelr_0.1.2      pkgconfig_2.0.1  
> [29] base64enc_0.1-3   htmltools_0.3.6   tidyselect_0.2.4  codetools_0.2-16 
> [33] dtw_1.20-1        crayon_1.3.4      withr_2.1.2       grid_3.5.1       
> [37] nlme_3.1-137      jsonlite_1.5      gtable_0.2.0      magrittr_1.5     
> [41] scales_1.0.0      cli_1.0.0         stringi_1.2.4     reshape2_1.4.3   
> [45] doParallel_1.0.14 xml2_1.2.0        IRkernel_0.8.12   iterators_1.0.10 
> [49] tools_3.5.1       glue_1.3.0        hms_0.4.2         parallel_3.5.1   
> [53] colorspace_1.3-2  rvest_0.3.2       pbdZMQ_0.3-3      bindr_0.1.1      
> [57] haven_1.1.2