insongkim / PanelMatch


Issue with refinement: non-numeric data exists and size.match #126

Closed mathurshreya95 closed 11 months ago

mathurshreya95 commented 1 year ago

Hi,

I'm trying to use the PanelMatch() function as follows:

matches2 <- PanelMatch(lag = 1,
                       time.id = "time_id",
                       unit.id = "unit_id",
                       treatment = "novel_current",
                       refinement.method = "ps.match",
                       data = as.data.frame(dt),
                       match.missing = FALSE,
                       covs.formula = ~ atq + emp + piq + seqq,
                       qoi = "att",
                       size.match = 2,
                       outcome.var = "as_spot_spend",
                       lead = 0)

However, I get the warning message:

Warning message:
In panel_match(lag, time.id, unit.id, treatment, refinement.method,  :
  non-numeric data exists. Only numeric (including binary) data can be used for refinement and calculations

1. I checked the class of my treatment variable and covariates. They are all numeric and have no missing values. The panel is unbalanced, in that every unit has a different year of entry, but I do not understand why I am getting this warning message.
2. Even with size.match = 2, I am getting much larger matched sets. Could someone explain why?

Thanks!

adamrauh commented 1 year ago

1. That warning message appears if you have any non-numeric data in the data frame that you pass to PanelMatch(). If you aren't using those columns anywhere in the matching/refinement process, then you don't need to worry about the warning.
2. Changing size.match sets the number of control units that receive non-zero weight; take a look at an individual matched set and you should be able to verify that. It doesn't change the number of controls matched to each treated observation based on the treatment history (which is the quantity that appears in the summary view).
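To see which columns trigger the warning described in point 1, a quick base R check along these lines may help. This is only a sketch on a toy data frame: `dt_df` stands in for `as.data.frame(dt)`, and the `firm_name` column is a hypothetical example of a non-numeric column that is not used in matching.

```r
# Toy stand-in for the data frame passed to PanelMatch().
# `firm_name` is a hypothetical non-numeric column, added for illustration.
dt_df <- data.frame(
  unit_id       = c(1, 1, 2, 2),
  time_id       = c(1, 2, 1, 2),
  novel_current = c(0, 1, 0, 0),
  atq           = c(10.5, 11.2, 8.1, 8.4),
  firm_name     = c("a", "a", "b", "b"),
  stringsAsFactors = FALSE
)

# Flag every column that is not numeric.
is_num <- vapply(dt_df, is.numeric, logical(1))
non_numeric_cols <- names(dt_df)[!is_num]
print(non_numeric_cols)

# Optionally keep only the columns the call actually uses, so that
# nothing non-numeric is passed along.
used_cols <- c("unit_id", "time_id", "novel_current", "atq")
dt_clean <- dt_df[, used_cols, drop = FALSE]
stopifnot(all(vapply(dt_clean, is.numeric, logical(1))))
```

Given the explanation above, passing a subset like `dt_clean` should avoid the warning, though leaving unused non-numeric columns in place is also harmless.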

mathurshreya95 commented 1 year ago

Thanks, Adam. Actually, all controls in matched sets end up receiving a non-zero weight. I have instances of >2 controls all with the same non-zero weight.

adamrauh commented 1 year ago

More than size.match control units will get a non-zero weight if those units are equally similar to the treated unit. So, for example, if 5 units have an identical distance from the treated unit, they will all get non-zero weight even if you set size.match to 2. What you're describing could happen if there's no variation in the control data, so that every control ends up equally distant from the treated unit.

If that's not the case let me know and I'll see if something is off!
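The tie behavior described above can be illustrated with a small base R sketch. This is not PanelMatch's internal code: `select_with_ties()` is a hypothetical helper, the distances are made up, and equal weights among the selected controls are an assumption made for illustration.

```r
# Hypothetical helper: keep the `size_match` closest controls, plus any
# controls tied with the last one kept. Not PanelMatch's implementation.
select_with_ties <- function(distances, size_match) {
  if (length(distances) <= size_match) {
    return(rep(TRUE, length(distances)))
  }
  cutoff <- sort(distances)[size_match]  # distance of the size_match-th closest
  distances <= cutoff                    # ties at the cutoff are all kept
}

# Assumed weighting: selected controls share the weight equally.
equal_weights <- function(keep) keep / sum(keep)

# Case 1: controls differ, so exactly two get non-zero weight.
d_distinct <- c(0.1, 0.5, 0.2, 0.9, 0.4)
w_distinct <- equal_weights(select_with_ties(d_distinct, size_match = 2))
print(w_distinct)

# Case 2: all five controls are equally distant, so all five are kept.
d_tied <- rep(0.3, 5)
w_tied <- equal_weights(select_with_ties(d_tied, size_match = 2))
print(w_tied)
```

To check the real weights, the package vignette describes them as stored in a "weights" attribute on each matched set, so something like `attr(matches2$att[[1]], "weights")` should show them, assuming the installed version exposes the matched sets as `matches2$att`.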