Daniel-Pailanir / sdid

Synthetic Difference in Differences for Stata
GNU General Public License v3.0

separate the two graph requests #40

Closed zhangxian8711 closed 1 year ago

zhangxian8711 commented 1 year ago

Hello, I am using sdid with a fairly large treated group. One thing I noticed is that I can only store the weights together with the graph option (both graphs, if my understanding is correct). In my use case, the graph that shows the weights is not feasible because of the size of my treated group, which also makes it impossible to export the group-mean graph. Is it possible to separate these two, so that we can still get the group-mean graph?

Daniel-Pailanir commented 1 year ago

Hello! First, you can access the time and unit weights without using the graph option, via the e(lambda) and e(omega) matrices. Second, the latest version of sdid exports only the trend graph by default, skipping the first graph from older versions (if you want that one, add the g1on option). So I recommend reinstalling the GitHub version and then running an example without inference to plot and save the weights, something like sdid Y G T D, vce(noinference), followed by matrix lambda=e(lambda) and matrix omega=e(omega).
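The commands above, collected into one sketch (Y G T D are the placeholder variable names from the example, and vce(noinference) skips inference as suggested):

```stata
* Weights are available without drawing any graphs
sdid Y G T D, vce(noinference)
matrix lambda = e(lambda)    // time weights
matrix omega  = e(omega)     // unit weights

* By default only the trend graph is drawn; g1on also draws the weight graph
sdid Y G T D, vce(noinference) graph g1on
```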

zhangxian8711 commented 1 year ago

Thanks for your prompt response. I guess I was not using the most up-to-date version. Another question: when I run the code you mentioned above, the results were fine and I can see the generated weights. However, when I include the graph option, I received an error message saying that groupid does not uniquely identify observations in the using data. My understanding is that I just need to make sure groupid and timevar uniquely identify the observations; is that wrong?

Daniel-Pailanir commented 1 year ago

What you say is correct. You need to make sure the panel is strongly balanced, so that error when you add the graph option is strange. I looked through the code trying to identify any errors, but I can't find anything out of order. If you have example data or code I could look at, please write to me: dpailanir@fen.uchile.cl

zhangxian8711 commented 1 year ago

Hi Daniel-Pailanir, sorry for the late reply. Unfortunately I cannot share the data with you, but I think I figured out the reason. My data is strongly balanced, which is why the sdid command works. However, my groupid is converted from string variables, so it can be quite large as an integer (in my data, groupid is stored as a double).

I wanted to manually generate the graph using the weights from sdid, so I converted the weight matrix e(omega) into two variables: one, I believe, is the groupid, and the other is the weight value. My plan was to merge the weight dataset back into the main dataset on groupid and then compute weighted averages of the outcome for plotting. However, I received an error saying that groupid does not uniquely identify observations in the weight dataset converted from e(omega): there were five different weights tied to one particularly large groupid. What is even stranger, I could not find that groupid value in the main dataset at all.

In the end, it turned out that in the weight dataset generated from e(omega), groupid was stored as float, which causes rounding errors and ends up producing duplicates. Once I replaced groupid with the row number, so that the values are much smaller, the problem disappeared. I have not read your ado file, so I am not sure this is the exact cause, but hopefully it provides a helpful lead. Thanks
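A sketch of a fix for the precision problem described above, assuming e(omega) carries the group identifiers in one of its columns: svmat accepts an optional storage type, so forcing double avoids the float rounding of large IDs (float holds only about seven significant digits).

```stata
* Convert the unit-weight matrix to variables, forcing double
* precision so large group identifiers are not rounded
matrix omega = e(omega)
svmat double omega, names(col)
```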

Daniel-Pailanir commented 1 year ago

Thank you very much for the detailed explanation. I think the issue is the very long group identifier you mentioned. Internally, the function creates a new identifier using something like egen newID = group(yourID), which gives an ID from 1 to N, so I think doing this step yourself before executing the sdid command will work better. Thank you very much for this comment; it will help us improve the code.
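The pre-processing step suggested above, as a minimal sketch (yourID and the other variable names are placeholders):

```stata
* Recode the long string-derived identifier to a compact 1..N ID
egen newID = group(yourID)

* Then run sdid on the compact ID, which is safe to store as float
sdid Y newID T D, vce(noinference)
```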