insongkim / PanelMatch

What should I do when the time variable in my data is not consecutive? #113

Closed Chenxin-Sun closed 1 year ago

Chenxin-Sun commented 1 year ago

Hi, sorry to bother you. I am currently doing undergraduate thesis research using PanelMatch, and I have a problem that I need your help with. My data is not a balanced panel, which means I don't have consecutive yearly observations for each unit. For example, I might have data for ID==1 only in 2001, 2004, and 2008, and for ID==2 only in 1997, 2001, and 2005. How do you suggest I convert my data to fit the PanelMatch assumption that the time variable must be sequential integers increasing by 1? Should I add extra rows with null values for ID==1 in 2002, 2003, 2005, 2006, and 2007? Or is there something else you would recommend?
Thanks a lot for your answer!

adamrauh commented 1 year ago

Hi @Chenxin-Sun, thanks for using the package. If you give the package sequential integer time data, it will "balance" the data using the min and max of the times provided. So, say the earliest year appearing in your data is 1997 and the latest is 2010. The package will create a time series for each unit from 1997 to 2010, adding missing values/NAs as appropriate.
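To make the "balancing" described above concrete, here is a minimal base R sketch of the same idea. It is not the package's internal code; the data frame `df` and its columns `id`, `time`, and `y` are made up for illustration.

```r
# Toy unbalanced panel (hypothetical data, for illustration only)
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  time = c(2001L, 2004L, 2008L, 1997L, 2001L, 2005L),
  y    = c(0.5, 0.7, 0.9, 1.1, 1.3, 1.5)
)

# Cross every unit with every year between the overall min and max
grid <- expand.grid(
  id   = unique(df$id),
  time = seq(min(df$time), max(df$time))
)

# Left join onto the grid: unit-years absent from df get NA in y
balanced <- merge(grid, df, by = c("id", "time"), all.x = TRUE)
balanced <- balanced[order(balanced$id, balanced$time), ]
```

With these toy values the grid spans 1997 to 2008, so each of the two units ends up with 12 rows, and only the 6 originally observed unit-years have a non-missing `y`.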

To solve the sequential integer problem, however: I believe simply converting the time column to a factor, then to integer ought to do it. Something like as.integer(as.factor(df$time))
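As a rough illustration of the factor-to-integer conversion suggested above, the sketch below applies it to a toy data frame. The names `df`, `time_index`, and `lookup` are assumptions chosen for this example, not part of the package.

```r
# Toy data with non-consecutive years (hypothetical, for illustration only)
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  time = c(2001L, 2004L, 2008L, 1997L, 2001L, 2005L)
)

# Factor levels are the sorted distinct years; as.integer() returns each
# year's position among those levels, giving indices 1, 2, ..., K
df$time_index <- as.integer(as.factor(df$time))

# Keep a lookup table so the indices can be mapped back to calendar years
lookup <- unique(df[order(df$time), c("time", "time_index")])
```

Note that this maps the distinct observed years to their ranks (here 1997, 2001, 2004, 2005, 2008 become 1 through 5), so gaps of different lengths between years all become a single step. Lags and leads are then counted in observed periods rather than calendar years, which may or may not be appropriate for a given analysis.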

aslunan commented 1 year ago

Hi, there is an issue with how DisplayTreatment in PanelMatch handles time.id. Even when the time id is consecutive integers, I get this error: "please convert time id to consecutive integers". Would you be able to help?

adamrauh commented 1 year ago

Closing this issue; it will be addressed in #116, which will be patched soon.