insongkim / PanelMatch


Error in panel_match: please convert time id to consecutive integers #123

Closed sbby23 closed 1 year ago

sbby23 commented 1 year ago

I would like to use your R package PanelMatch for the quantitative analysis in my Master's thesis.

However, I always get the following error when calling the PanelMatch function:

Error in panel_match(lag, time.id, unit.id, treatment, refinement.method, : please convert time id to consecutive integers

I have tried recoding the time variable in several different ways: 0-31, 1-32, years, quarters, and so on. I have also made sure that the variable is numeric, applied as.data.frame() to my dataset as suggested in previous posts, and confirmed that R recognizes both the time ID and the unit ID as numeric variables. None of this has helped, which is why I am writing to you for help. Is there a specific format the time variable must follow? Is there anything else I can do to make the command work?

I look forward to hearing from you.

PS: here is my code:

set_democratic <- PanelMatch(
  lag = 16,
  time.id = "timecont",
  unit.id = "company_number",
  treatment = "democratleaning",
  refinement.method = "mahalanobis",
  data = date,
  match.missing = TRUE,
  covs.formula = ~ surplus + cash + longtermdebt + dividends + sales + marketvalue,
  size.match = 10,
  qoi = "ate",
  outcome.var = "numberdealsFDI",
  lead = 0:16,
  forbid.treatment.reversal = FALSE
)
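Editorial note: one thing worth checking here is how the time column is stored, not just which values it holds. In R, a column of whole numbers such as 2012 or 1, 2, 3 usually has class "numeric" (double) after import, and is.numeric() returns TRUE for it even though it is not of integer type. Whether PanelMatch tests for integer storage specifically is an assumption here, not something confirmed in this thread. The sketch below uses made-up toy data (only the column names are borrowed from the post) to show the difference and one way to recode years as consecutive integers.

```r
# Toy data standing in for the real panel; values are invented for illustration.
toy <- data.frame(
  company_number = rep(1:2, each = 3),
  year = rep(c(2012, 2013, 2014), times = 2)  # whole numbers stored as double
)

class(toy$year)       # "numeric": passes is.numeric(), but is not integer storage
is.integer(toy$year)  # FALSE

# Shift the years so the first period is 1, then store the result as integer.
# Subtracting the minimum (rather than ranking) keeps real gaps visible.
toy$timecont <- as.integer(toy$year - min(toy$year) + 1)

class(toy$timecont)   # "integer"

# TRUE when the observed periods have no gaps between the first and the last.
is_consecutive <- all(diff(sort(unique(toy$timecont))) == 1)
```

If is_consecutive comes back FALSE on real data, some period is missing entirely, and recoding alone will not fix it.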

ilango2486 commented 1 year ago

Hi @sbby23,

I ran into this problem several times and have an idea of what the issue could be. Do you have all the time IDs for all your units? For example, unit 1 should have a row for every value from 1 to 32 in its timecont column, even if, say, periods 10 and 15 have no data. If some rows are missing, add them and code the other variables in those rows as NA. Let me know if that helps. If not, please share a sample of your dataset so we can have a look.

Another thing: a lag of 16 seems like overkill. I think you should be fine with a smaller lag.
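Editorial note: the suggestion above (one row per unit for every period, with NA where nothing was observed) can be sketched in base R as follows. The data are invented for illustration, and only the column names come from the original post; this is one possible approach, not the package authors' method.

```r
# Toy unbalanced panel: unit 2 has no row for period 2.
toy <- data.frame(
  company_number = c(1L, 1L, 1L, 2L, 2L),
  timecont       = c(1L, 2L, 3L, 1L, 3L),
  sales          = c(10, 12, 11, 7, 9)
)

# Every unit crossed with every period in the observed range.
full_grid <- expand.grid(
  company_number = sort(unique(toy$company_number)),
  timecont       = seq(min(toy$timecont), max(toy$timecont))
)

# Left join: unit-period pairs absent from the data get NA in the other columns.
balanced <- merge(full_grid, toy,
                  by = c("company_number", "timecont"),
                  all.x = TRUE)
balanced <- balanced[order(balanced$company_number, balanced$timecont), ]
rownames(balanced) <- NULL

# Each unit should now have the same number of rows.
table(balanced$company_number)
```

Because the missing rows carry NA for the covariates and the outcome, how they are handled in matching then depends on the package's missing-data options.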

adamrauh commented 1 year ago

What @ilango2486 said sounds like good advice to me. I would also note that we recently made some updates on the se_comparison branch that try to convert your data to the right format (as opposed to just throwing an error). However, there is always a chance of undefined or odd results if the code makes incorrect assumptions. All that to say, maybe try updating to the latest version on that branch (from the GitHub repo, not on CRAN yet) and see what happens.

sbby23 commented 1 year ago

Dear @ilango2486

First, thank you very much for your reply and suggestions. Please find below my responses to your comments.

@ilango2486: I have updated my dataset. Each unit now contains 8 time IDs (each number representing a year between 2012 and 2019). However, I still get the same error, whether I use the year variable or the time variable. I have therefore attached a sample of my dataset. I have also attached my R code (as a Word file, since GitHub won't allow me to upload an R file), and I have reduced the lag in the PanelMatch call.

I am sorry for any inconvenience caused but very much appreciate your support. I am looking forward to hearing from you!

sample_year.xlsx sample_year.docx

sbby23 commented 1 year ago

Dear @adamrauh

First, thank you very much for your reply and suggestions. Please find below my responses to your comments.

@adamrauh: I tried to download the newest version by using the command:

install_github("insongkim/PanelMatch", dependencies = TRUE, ref = "se_comparison")

However, I get the following error:

" installation of package ‘/var/folders/q5/v94ndgpx35qgq9j0w8ygp_wr0000gn/T//Rtmpgbula4/file55c82c037ff4/PanelMatch_2.0.2.tar.gz’ had non-zero exit status ".

I have tried updating R to the newest version, but it did not help (Version 2023.06.0+421). I use a Mac with macOS Big Sur 11.7.6; could that be the problem?

I am sorry for any inconvenience caused but very much appreciate your support. I am looking forward to hearing from you!