ssdavenport / microsynth

Synthetic controls for micro-level data

Problems with match.out #17

Closed tanby91 closed 3 years ago

tanby91 commented 5 years ago

Hi,

I'm having some problems with using match.out, and I can't seem to figure out why.

1) Setting match.out to be different from result.var doesn't work. For example:

sea1 <- microsynth(seattledmi,
                   idvar="ID", timevar="time", intvar="Intervention",
                   start.pre=1, end.pre=12, end.post=16,
                   match.out=match.out, match.covar=cov.var,
                   result.var="i_felony", omnibus.var=match.out,
                   test="lower")

gives the error:

Error in bigdat[, result.var[j], lows[i]:highs[i]] : subscript out of bounds

2) I'm having problems aggregating the values in match.out in a long time-series panel. In my dataset, where the pre-intervention period is 1096 periods long, I'm getting an error:

Error in if (m <= tol) { : argument is of length zero

when setting match.out in the format:

list("y1"=rep(7, 156), "y2"=rep(7, 156)).

I would deeply appreciate it if you could tell me I'm doing something wrong, or if it's really a bug. I attach a sample of the data and code that reproduces the errors.

sample_data.zip

ssdavenport commented 5 years ago

There does appear to be a bug: if result.var is a scalar while match.out is a vector, you’ll get an error. The package is not built for that analysis; however, one way around it is to specify a vector for result.var. Would that be a good fix for you?
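A minimal sketch of that workaround, assuming the simplest vector choice is to pass the full match.out vector as result.var (as the package's seattledmi example does) and then read off only the outcome of interest. The variable names come from the example in this thread; `outcome_of_interest` is a hypothetical helper name, and the model call is guarded so it only runs in an interactive session with microsynth installed.

```r
# Outcomes to match on, taken from the seattledmi example in this thread.
match.out <- c("i_felony", "i_misdemea", "i_drugs", "any_crime")

# Workaround sketch: report results for every matched outcome rather than
# a single scalar, then look only at the outcome you care about.
result.var <- match.out
outcome_of_interest <- "i_felony"  # hypothetical name, for illustration
stopifnot(outcome_of_interest %in% result.var)

# Fitting can be slow, so only run it interactively and if the package exists.
if (interactive() && requireNamespace("microsynth", quietly = TRUE)) {
  data("seattledmi", package = "microsynth")
  cov.var <- c("TotalPop", "BLACK", "HISPANIC", "Males_1521", "HOUSEHOLDS",
               "FAMILYHOUS", "FEMALE_HOU", "RENTER_HOU", "VACANT_HOU")
  sea1 <- microsynth::microsynth(seattledmi,
                                 idvar = "ID", timevar = "time",
                                 intvar = "Intervention",
                                 start.pre = 1, end.pre = 12, end.post = 16,
                                 match.out = match.out, match.covar = cov.var,
                                 result.var = result.var,
                                 omnibus.var = result.var,
                                 test = "lower")
  summary(sea1)
}
```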

There may be an issue with your data. The example you sent works when you set weeks<-rep(7,105), but not when you set weeks<-rep(7,106). It is always good to check for NAs.
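Two quick sanity checks along these lines, sketched in base R. The first assumes that each match.out list entry is a vector of block lengths whose sum has to fit inside the pre-intervention window; `blocks_fit` is a hypothetical helper, not part of microsynth. The second counts NAs per outcome on a toy panel with made-up values standing in for the real data.

```r
# Hypothetical helper: do the aggregation blocks fit in the pre-period?
blocks_fit <- function(blocks, n_pre) {
  sum(blocks) <= n_pre
}

# Figures from the original report: 156 weekly blocks, 1096 pre-periods.
weeks <- rep(7, 156)
total_periods <- sum(weeks)                        # 7 * 156 = 1092
fits_reported <- blocks_fit(weeks, 1096)           # 1092 <= 1096
fits_one_more <- blocks_fit(rep(7, 157), 1096)     # 1099 > 1096

# Toy long-format panel (made-up values) for the NA check.
panel <- data.frame(
  ID   = rep(1:3, each = 4),
  time = rep(1:4, times = 3),
  y1   = c(1, 0, 2, 1, NA, 1, 0, 3, 2, 2, 1, 0),
  y2   = c(0, 1, 1, 0, 1, 0, NA, NA, 0, 1, 1, 2)
)
outcome_vars <- c("y1", "y2")

# Number of missing values in each outcome column.
na_counts <- colSums(is.na(panel[, outcome_vars, drop = FALSE]))

# Balanced panel: every ID should appear once per time period.
is_balanced <- all(table(panel$ID) == length(unique(panel$time)))

print(total_periods)
print(c(reported = fits_reported, one_more_block = fits_one_more))
print(na_counts)
print(is_balanced)
```

If the block lengths overrun the pre-period, or any outcome has NAs inside the window, that is worth ruling out before suspecting the package.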

Another issue is that there are only 100 cases -- that is pretty small for microsynth. It won't tell you much with so few cases, nor is there a guarantee that you'll be able to find a match with many constraints.

tanby91 commented 5 years ago

Yes, the scalar is not a problem. About the 2nd point, I think it has to do with the two variables being really collinear. I do have a lot more than 100 cases - I just had to drop a lot of them to get below the filesize limits on uploads. Thanks for your help!
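For anyone wanting to test the collinearity suspicion on their own data, here is a rough base-R sketch. The series are simulated stand-ins, not the data from this issue, and the check says nothing about how microsynth handles collinear constraints internally.

```r
set.seed(1)

# Simulated outcomes: y2 is y1 plus a rare extra event, so they move together.
y1 <- rpois(200, lambda = 3)
y2 <- y1 + rbinom(200, size = 1, prob = 0.05)

# Pairwise correlation close to 1 signals near-collinearity.
r <- cor(y1, y2)
print(r)

# Exact collinearity shows up as a rank-deficient constraint matrix:
# three columns (intercept, y1, copy of y1) but only two independent ones.
X_exact <- cbind(intercept = 1, y1 = y1, y1_copy = y1)
rank_exact <- qr(X_exact)$rank
print(c(columns = ncol(X_exact), rank = rank_exact))
```

A correlation near 1, or a rank below the number of columns, suggests dropping or combining one of the outcomes before matching.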

michaelstiefel commented 4 years ago

I'm facing exactly the same issue with the subscript out of bounds error when the time-varying outcome variables in result.var are only a subset of the time-varying matching variables in match.out. I can replicate the same error message in the example of the vignette and it does not seem to be related to a scalar in my opinion:


data(seattledmi)
seattledmi <- as.data.frame(seattledmi)
set.seed(99199)

cov.var <- c("TotalPop", "BLACK", "HISPANIC", "Males_1521", "HOUSEHOLDS", 
             "FAMILYHOUS", "FEMALE_HOU", "RENTER_HOU", "VACANT_HOU")

match.out <- c("i_felony", "i_misdemea", "i_drugs", "any_crime")

result.var <- c("i_drugs", "any_crime")

sea1 <- microsynth(seattledmi, 
                   idvar="ID", timevar="time", intvar="Intervention", 
                   start.pre=1, end.pre=12, end.post=16, 
                   match.out=match.out, match.covar=cov.var, 
                   result.var=result.var, omnibus.var=result.var,
                   test="lower",
                   n.cores = min(parallel::detectCores(), 2))

summary(sea1)
plot_microsynth(sea1)

which gives the error message:

Calculating weights... Error in bigdat[, result.var[j], lows[i]:highs[i]] : subscript out of bounds

(Thanks for a really useful procedure & package btw!).

ssdavenport commented 3 years ago

Michael -- apologies for the delay -- this issue has been fixed in the most recent patch (v2.0.20).