JMSLab / xtevent

Stata package -xtevent-
MIT License
44 stars 12 forks source link

invalid syntax error when specifying a variable as proxyiv #114

Closed Constantino-Carreto-Romero closed 1 year ago

Constantino-Carreto-Romero commented 2 years ago

In the IV setting, when specifying a variable as proxyiv, it produces an invalid syntax error. For instance:

*load example dataset
use "https://github.com/JMSLab/xtevent/blob/main/test/example31.dta?raw=true", clear
xtset i t
xtevent y x, panelvar(i) t(t) policyvar(z) window(3) proxy(zeta) proxyiv(eta)
jorpppp commented 2 years ago

@Constantino-Carreto-Romero This used to work, maybe some of the changes to the parser broke something? Or is this a bug on the latest published version?

Also, shouldn't this show up when running the test file? Why did we miss it?

Constantino-Carreto-Romero commented 2 years ago

@jorpppp I executed it with xtevent 1.0.0 and it showed up the same error. In the test file, the example that uses a variable as proxyiv, use a collinear variable, so the collinearity error showed first (this expected error is muted in the test file, so it doesn't stop the execution). However, in the test file, there is not an example with a non-collinear variable, so this error doesn't show up when I execute the test file. I found this bug while testing #111

jorpppp commented 2 years ago

@Constantino-Carreto-Romero Now that #111 is closed, can you take a closer look at this, please?

Constantino-Carreto-Romero commented 2 years ago

@jorpppp sure thing, I already started to check it.

Constantino-Carreto-Romero commented 2 years ago

@jorpppp It seems that there are two bugs:

jorpppp commented 2 years ago

@jorpppp It seems that there are two bugs:

  • In this block of code, that goes from line 161 to line 175, for a proxyiv that is only numbers (numlist) or numbers and variable names (mixed), we want to include in the local ivnorm all the integers that indicate leads to use as instruments. However, the way it is designed, whenever a variable name is included as an instrument along with some lead orders (i.e., the user specifies a mixed proxyiv), this error message will be shown. This is preventing the user from using a mixed proxyiv. I think the solution is to change the foreach loop, so it loops over a version of proxyiv that doesn't contain variable names. This change doesn't affect subsequent code.
  • In that block of code, that check is done only for a numlist or a mixed proxyiv. If the user specifies only variable names as proxyiv (i.e., a varlist proxyiv), the local ivnorm stays empty. Then, in the code for the scaling factor, that goes from line 468 to 480, it always requires that the local ivnorm to be non-empty. This is what causes the message error pointed out in invalid syntax error when specifying a variable as proxyiv #114 (comment). In this setting, where the user specified only variable names as proxyiv and thus no lead order is specified, for the scaling factor computation, ¿what coefficient number should be used as a reference? For the scaling factor computation, the program uses the smallest lead order among the specified lead orders. I think in the case where no lead order is specified, we could make the program use the coefficients the user indicated in the option norm.

Thanks @Constantino-Carreto-Romero. This makes sense. The fix you suggest for the loop sounds good.

The value of ivnorm for the case when no leads are used should be -1.

Can you try to implement the fixes, please?

Constantino-Carreto-Romero commented 2 years ago

@jorpppp sure thing, I will start working on implementing it.

Constantino-Carreto-Romero commented 2 years ago
  • In this block of code, that goes from line 161 to line 175, for a proxyiv that is only numbers (numlist) or numbers and variable names (mixed), we want to include in the local ivnorm all the integers that indicate leads to use as instruments. However, the way it is designed, whenever a variable name is included as an instrument along with some lead orders (i.e., the user specifies a mixed proxyiv), this error message will be shown. This is preventing the user from using a mixed proxyiv. I think the solution is to change the foreach loop, so it loops over a version of proxyiv that doesn't contain variable names. This change doesn't affect subsequent code.

I implemented this in https://github.com/JMSLab/xtevent/commit/0fec5c442b1648f17c25df66ea2ec93f00dbca42.

  • In that block of code, that check is done only for a numlist or a mixed proxyiv. If the user specifies only variable names as proxyiv (i.e., a varlist proxyiv), the local ivnorm stays empty. Then, in the code for the scaling factor, that goes from line 468 to 480, it always requires that the local ivnorm to be non-empty. This is what causes the message error pointed out in invalid syntax error when specifying a variable as proxyiv #114 (comment). In this setting, where the user specified only variable names as proxyiv and thus no lead order is specified, for the scaling factor computation, ¿what coefficient number should be used as a reference? For the scaling factor computation, the program uses the smallest lead order among the specified lead orders. I think in the case where no lead order is specified, we could make the program use the coefficients the user indicated in the option norm.

It is not possible to set ivnorm to -1 because if the normalized coefficient is norm=-1, the program doesn't compute that coefficient. The coefficient to compute the scaling factor needs to be different from the normalized coefficient. ¿What coefficient shoul we use? Temporarily, in https://github.com/JMSLab/xtevent/commit/adbdb36bfad11b0f857d9a7cdd814f03ab15cb88 I set this coefficient to be equal to norm-1.

While testing this implementation, I also noticed that the change of lead order carried out in this part of the code, only covers the case where proxyiv= norm=-1, but if they are equal to another integer, later an error appears. So in https://github.com/JMSLab/xtevent/commit/adbdb36bfad11b0f857d9a7cdd814f03ab15cb88 I also added some changes to cover those other cases. Tentatively, I added a warning message.

These are some examples check these implementations:

*install the branch version 
net install xtevent, from("https://raw.githubusercontent.com/JMSLab/xtevent/issue114_proxyiv_doesnt_allow_varnames") replace

*load example dataset
use "https://github.com/JMSLab/xtevent/blob/main/test/example31.dta?raw=true", clear
xtset i t

*default case
xtevent y x, panelvar(i) t(t) policyvar(z) window(3) proxy(zeta) 

*Mixed proxyiv
*This proxyiv type used to generate an error.  
xtevent y x, panelvar(i) t(t) policyvar(z) window(3) proxy(zeta) proxyiv(2 3 eta)

*varlist proxyiv 
*This proxyiv type used to generate an error. 
*since norm = -1, the scaling factor is computed based on coefficient -2
xtevent y x, panelvar(i) t(t) policyvar(z) window(3) proxy(zeta) proxyiv(eta alpha) 

*Case when a lead order = normalized coefficient.
* This change was done by the program only for the case: lead order = normalized coefficient = -1.
*But in other cases it generated an error.
*the program changes the lead order and shows a warning message.
xtevent y x, panelvar(i) t(t) policyvar(z) window(3) proxy(zeta) proxyiv(1) norm(-1)
xtevent y x, panelvar(i) t(t) policyvar(z) window(3) proxy(zeta) proxyiv(2) norm(-2)
jorpppp commented 2 years ago

Thanks @Constantino-Carreto-Romero. In cases when we have external instruments, we should normalize one coefficient per instrumental variable as long as the # of instruments is smaller than the LHS of the window. Can you implement this please?

Constantino-Carreto-Romero commented 2 years ago

update: I'm currently working on it

Constantino-Carreto-Romero commented 2 years ago

@jorpppp in https://github.com/JMSLab/xtevent/commit/ddf03f73e81c6ebe97349198cfede59afa232c41 I implemented https://github.com/JMSLab/xtevent/issues/114#issuecomment-1275257341

These are some examples to check the implementation. This is the log file.

jorpppp commented 2 years ago

Per call:

jorpppp commented 2 years ago

Also note that we should change the language of the messages following #116

Constantino-Carreto-Romero commented 2 years ago

@jorpppp

Constantino-Carreto-Romero commented 1 year ago

Per call: the implementation looks correct. There are only some examples that need small changes. After that, @Constantino-Carreto-Romero can open the PR

Constantino-Carreto-Romero commented 1 year ago

In https://github.com/JMSLab/xtevent/commit/429f9d5487c70026a378e889474da32f7f08f828 I implemented small changes to some examples to check the implementation. I also uploaded the log file. Both of them are placed in the issue folder. I ran the test file again and checked that xteventplot keeps working properly. For instance, when executing:

xtevent y x, panelvar(i) t(t) policyvar(z) window(4) proxy(zeta) proxyiv(1 eta u)

it produces the following plot, where the ommited coefficients are -1,-2,-3, and -4.