insongkim / PanelMatch

Handling different numbers of maximum lags in data #106

Closed emcghee73 closed 2 years ago

emcghee73 commented 2 years ago

What does PanelMatch do if some of the treated units have fewer total lags in the data than lags specified in PanelMatch? For example, if I set PanelMatch to use 2 lags (both for the initial matched set and for refinement) but treated unit X only has one lag in the data, what does PanelMatch do? Does it drop X? Or does it attempt to match on one lag?

adamrauh commented 2 years ago

Hi @emcghee73, thanks for using the package. I have a clarifying question: are you talking about the case where one or more treated units are missing some treatment data in the lag window?

emcghee73 commented 2 years ago

Not missing data per se, but rather the consequences of staggered treatment. Say I have panel cross-sectional data that start at t=1, and some units are treated at t=2 and some at t=3. If I specify 2 lags, does PanelMatch drop the units that were treated at t=2?

After experimenting a little I think the answer is "yes," but it would be nice to be certain. Let me know if this makes sense or if I need to clarify further. Thanks!

kosukeimai commented 2 years ago

Yes, PanelMatch drops those observations (@adamrauh can correct me if I'm mistaken). We don't allow for varying lag lengths. If you want to include them, you could append artificial data with identical covariate values, so that the algorithm picks the best match based on the t=1 observation and essentially ignores the appended data.
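
To make the logic above concrete, here is a minimal sketch in plain Python. It is not PanelMatch code and does not call the package. The function names (`has_full_lag_window`, `pad_earliest_rows`), the row layout, and the choice to mark appended rows as untreated are all assumptions made for illustration. It shows why a unit treated at t=2 lacks a complete 2-period lag window when the panel starts at t=1, and how appending a copy of the earliest observation would complete that window.

```python
def has_full_lag_window(first_period, treated_at, lag):
    """True if periods treated_at - lag, ..., treated_at - 1 all lie inside the panel."""
    return treated_at - lag >= first_period


def pad_earliest_rows(rows, n_extra):
    """Append n_extra artificial periods before each unit's earliest row.

    Covariates are copied from the earliest observed row. Marking the
    artificial rows as untreated is an assumption of this sketch.
    """
    earliest = {}
    for row in rows:
        unit = row["unit"]
        if unit not in earliest or row["time"] < earliest[unit]["time"]:
            earliest[unit] = row

    padded = list(rows)
    for row in earliest.values():
        for k in range(1, n_extra + 1):
            artificial = dict(row)          # identical covariate values
            artificial["time"] = row["time"] - k
            artificial["treated"] = 0       # assumption: untreated before the panel starts
            padded.append(artificial)
    return sorted(padded, key=lambda r: (r["unit"], r["time"]))


# Toy panel starting at t=1: unit A is treated at t=2, unit B at t=3.
rows = [
    {"unit": "A", "time": 1, "treated": 0, "x": 5.0},
    {"unit": "A", "time": 2, "treated": 1, "x": 6.0},
    {"unit": "B", "time": 1, "treated": 0, "x": 4.0},
    {"unit": "B", "time": 2, "treated": 0, "x": 4.5},
    {"unit": "B", "time": 3, "treated": 1, "x": 5.5},
]
treated_at = {"A": 2, "B": 3}
lag = 2

# Before padding: A's window would need t=0, which is not in the data.
before = {u: has_full_lag_window(1, t, lag) for u, t in treated_at.items()}

# After padding one artificial period, the panel starts at t=0.
padded = pad_earliest_rows(rows, n_extra=1)
first_period = min(r["time"] for r in padded)
after = {u: has_full_lag_window(first_period, t, lag) for u, t in treated_at.items()}

print(before)
print(after)
```

In this toy example only unit A changes status after padding. Whether the padded unit then ends up with a non-empty matched set still depends on the control units available, which this sketch does not model.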

emcghee73 commented 2 years ago

Thank you--very helpful!