bquistorff / synth_runner

A tool to run a pool of synthetic controls, conduct inference, and produce visualizations.

synth_runner with "nested allopt" does not finish the estimation (memory problem) #30

Closed estebancolla closed 5 years ago

estebancolla commented 5 years ago

Expected behavior and actual behavior

I am using synth_runner with the nested and allopt options (passed through to synth), together with gen_vars. The routine uses a lot of memory: about 97% of my computer's 16 GB of RAM. Sometimes the estimation finishes; other times it crashes with the following output:

Output when it crashes:

Estimating the treatment effects
Estimating the possible placebo effects (one set for each of the 1 treatment periods)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 Total: 68
...............................................x.. 2.18h elapsed. 47.12m remaining
.....xxxxxxxxxxxx| 2.70h elapsed. 
op. sys. refuses to provide memory
    In the process of initializing itself, Stata's data-storage memory manager attempted to allocate 1m bytes of
    memory and the operating system said no.
    Could not open Stata's ds3 system for merging and appending data.
r(909);

end of do-file

r(909);

Steps to reproduce the problem

(This example reproduces the heavy memory use but does not crash. If needed, I can provide a larger dataset and a routine with more predictors.)

sysuse synth_smoking, clear
tsset state year
gen byte D = (state==3 & year>=1989)
synth_runner cigsale ///
     beer(1984(1)1988) lnincome(1972(1)1988) retprice age15to24, ///
     trunit(3) trperiod(1989) ///
     nested allopt trends gen_vars
ereturn list
single_treatment_graphs
effect_graphs
pval_graphs

bquistorff commented 5 years ago

It looks like synth is running out of memory and failing first. synth_runner doesn't use much extra memory unless you request confidence intervals in addition to p-values (so turn those off if you were using them). There is not much I can do on my end. Try a machine with more memory.
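
One way to check where the memory goes is to look at Stata's own memory report around a lighter run. The sketch below reuses the specification from the report but drops allopt and, like the original call, does not request confidence intervals. It assumes a Stata version that provides the memory and query memory commands. Whether dropping allopt lowers peak memory is an assumption and was not confirmed in this thread; it should at least shorten the run, since allopt repeats the nested optimization from several starting points.

```stata
* Sketch only: compare Stata's memory report before and after a lighter run.
sysuse synth_smoking, clear
tsset state year

query memory     // current memory settings (max_memory, segmentsize, ...)
memory           // memory in use before estimation

* Same specification as in the report, minus allopt.
* Assumption: fewer optimizer restarts; effect on peak memory is untested.
synth_runner cigsale ///
     beer(1984(1)1988) lnincome(1972(1)1988) retprice age15to24, ///
     trunit(3) trperiod(1989) ///
     nested trends gen_vars

memory           // memory in use after estimation
```

If the second report is close to the first, the pressure is more likely coming from the repeated synth calls during the placebo runs than from the variables that gen_vars adds to the dataset.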