JMSLab / xtevent

Stata package -xtevent-
MIT License
44 stars 12 forks source link

PR for #170: let the window option choose a window range #178

Closed Constantino-Carreto-Romero closed 7 months ago

Constantino-Carreto-Romero commented 7 months ago

Closes #170

jorpppp commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(max) with the test data I get a no observations error. This is counterintuive since xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(5) does run. So I would think the maximum window in this setting is at least 5. There's no imputation here, is there a bug when calculating the max window without impute?

jorpppp commented 7 months ago

@Constantino-Carreto-Romero I get an error when running xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(max) impute(stag) trend(-2) This happens because the check for whether the trend starting point is in the estimation window is using the values of lwindow and rwindow and at the time of the check these are strings. Maybe the check needs to be later, and it must use the calculated window limits.

jorpppp commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) I get an error message. The found window limits are -1 and -1.

This doesn't seem right. If we run

xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) savek(event)
bys i: egen min = min(event_evtime)
bys i: egen max = max(event_evtime)
preserve
bys i: keep if _n == 1
tab min
tab max
restore 

We get that all treated units have at least two periods pretreatment and one period post treatment (the treatment period, 0). So the balanced window should be -2, 0. Why is it not the case? Am I misreading how the balanced window should be calculated?

jorpppp commented 7 months ago

@Constantino-Carreto-Romero please run more checks interacting the new window options and the other options in xtevent. The fact that I was able to find clashes when using trend or using the balanced option makes me think that we need more checking before we are ready to merge.

jorpppp commented 7 months ago

@Constantino-Carreto-Romero Note changes in #181 to avoid conflicts when merging this PR

Constantino-Carreto-Romero commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(max) with the test data I get a no observations error. This is counterintuive since xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(5) does run. So I would think the maximum window in this setting is at least 5. There's no imputation here, is there a bug when calculating the max window without impute?

@jorpppp thanks If I run the following example

use example31.dta, clear
xtset i t 
*generate standardized time variable 
gen diffpolicyvar=d1.z
gen timeoftreatment=t if diffpolicyvar==1
by i: egen ttreat= min(timeoftreatment)
gen t_relative = t - ttreat
*tabulate the standardized time variable
tab t_relative

when tabulating the standardized time variable I get the following. It indicates that the maximum range of estimation is -19, 17. But then, adjusting for the endpoints, the maximum possible window the user could specify is window(-18 16). Internally, the program tries to estimate using that window. Nonetheless, since impute is not active, the event-time dummies are generated with missing values. For instance, running the following

xtevent y, panelvar(i) timevar(t) policyvar(z) w(-18 16) savek(aa, noestimate) 

and then taking a look at the event-time dummies, or checking

count if !missing(aa_eq_m19) & !missing(aa_eq_p17)

it returns 0. Therefore, no observation succeeds in having non-missing values for all the event-time dummies. Now if I use the eventdd command, I get the following result. By default, eventdd estimates the maximum possible window, in this case, 19 leads and 17 lags (eventdd does not differentiate the endpoints from the rest of event-time dummies). eventdd succeeds at estimating because by default it assumes staggered adoption and therefore it imputes the event-time dummies and endpoints. If I now try xtevent with the impute option:

xtevent y, panelvar(i) timevar(t) policyvar(z) w(-18 16) impute(stag) 

I get the same estimation range as with eventdd. Therefore, in the case of xtevent, with the default of not activating the impute option, success in the estimation will depend on how many event-time dummies coefficients it has to estimate, according to the calculated window. If the dataset has a short observed time range, the maximum possible window will likely be short enough such that even without the impute option, the program will complete the estimation.

jorpppp commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(max) with the test data I get a no observations error. This is counterintuive since xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(5) does run. So I would think the maximum window in this setting is at least 5. There's no imputation here, is there a bug when calculating the max window without impute?

@jorpppp thanks If I run the following example

use example31.dta, clear
xtset i t 
*generate standardized time variable 
gen diffpolicyvar=d1.z
gen timeoftreatment=t if diffpolicyvar==1
by i: egen ttreat= min(timeoftreatment)
gen t_relative = t - ttreat
*tabulate the standardized time variable
tab t_relative

when tabulating the standardized time variable I get the following. It indicates that the maximum range of estimation is -19, 17. But then, adjusting for the endpoints, the maximum possible window the user could specify is window(-18 16). Internally, the program tries to estimate using that window. Nonetheless, since impute is not active, the event-time dummies are generated with missing values. For instance, running the following

xtevent y, panelvar(i) timevar(t) policyvar(z) w(-18 16) savek(aa, noestimate) 

and then taking a look at the event-time dummies, or checking

count if !missing(aa_eq_m19) & !missing(aa_eq_p17)

it returns 0. Therefore, no observation succeeds in having non-missing values for all the event-time dummies. Now if I use the eventdd command, I get the following result. By default, eventdd estimates the maximum possible window, in this case, 19 leads and 17 lags (eventdd does not differentiate the endpoints from the rest of event-time dummies). eventdd succeeds at estimating because by default it assumes staggered adoption and therefore it imputes the event-time dummies and endpoints. If I now try xtevent with the impute option:

xtevent y, panelvar(i) timevar(t) policyvar(z) w(-18 16) impute(stag) 

I get the same estimation range as with eventdd. Therefore, in the case of xtevent, with the default of not activating the impute option, success in the estimation will depend on how many event-time dummies coefficients it has to estimate, according to the calculated window. If the dataset has a short observed time range, the maximum possible window will likely be short enough such that even without the impute option, the program will complete the estimation.

@Constantino-Carreto-Romero What I am reading from this is that the criteria to select the maximum window that we are using only works for staggered adoption, and that we need to figure out a criterion that works outside the staggered adoption case. Can you figure out what the criterion is?

jorpppp commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(max) with the test data I get a no observations error. This is counterintuive since xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(5) does run. So I would think the maximum window in this setting is at least 5. There's no imputation here, is there a bug when calculating the max window without impute?

@jorpppp thanks If I run the following example

use example31.dta, clear
xtset i t 
*generate standardized time variable 
gen diffpolicyvar=d1.z
gen timeoftreatment=t if diffpolicyvar==1
by i: egen ttreat= min(timeoftreatment)
gen t_relative = t - ttreat
*tabulate the standardized time variable
tab t_relative

when tabulating the standardized time variable I get the following. It indicates that the maximum range of estimation is -19, 17. But then, adjusting for the endpoints, the maximum possible window the user could specify is window(-18 16). Internally, the program tries to estimate using that window. Nonetheless, since impute is not active, the event-time dummies are generated with missing values. For instance, running the following

xtevent y, panelvar(i) timevar(t) policyvar(z) w(-18 16) savek(aa, noestimate) 

and then taking a look at the event-time dummies, or checking

count if !missing(aa_eq_m19) & !missing(aa_eq_p17)

it returns 0. Therefore, no observation succeeds in having non-missing values for all the event-time dummies. Now if I use the eventdd command, I get the following result. By default, eventdd estimates the maximum possible window, in this case, 19 leads and 17 lags (eventdd does not differentiate the endpoints from the rest of event-time dummies). eventdd succeeds at estimating because by default it assumes staggered adoption and therefore it imputes the event-time dummies and endpoints. If I now try xtevent with the impute option:

xtevent y, panelvar(i) timevar(t) policyvar(z) w(-18 16) impute(stag) 

I get the same estimation range as with eventdd. Therefore, in the case of xtevent, with the default of not activating the impute option, success in the estimation will depend on how many event-time dummies coefficients it has to estimate, according to the calculated window. If the dataset has a short observed time range, the maximum possible window will likely be short enough such that even without the impute option, the program will complete the estimation.

@Constantino-Carreto-Romero What I am reading from this is that the criteria to select the maximum window that we are using only works for staggered adoption, and that we need to figure out a criterion that works outside the staggered adoption case. Can you figure out what the criterion is?

Per call: we won't try to deal with the case of automatic window choice outside staggered adoption. @Constantino-Carreto-Romero

jorpppp commented 7 months ago

@Constantino-Carreto-Romero we won't deal with the case outside staggered adoption. So the automatic window generation commands should check for staggered adoption before proceding, and return an error otherwise.

The other open comments in this PR are also pending.

jorpppp commented 7 months ago

@Constantino-Carreto-Romero Keep the changes in #187 in mind when merging this PR

Constantino-Carreto-Romero commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(max) with the test data I get a no observations error. This is counterintuive since xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(5) does run. So I would think the maximum window in this setting is at least 5. There's no imputation here, is there a bug when calculating the max window without impute?

@jorpppp In https://github.com/JMSLab/xtevent/commit/8cc6718bdb4ae01dbef3c530bd6914900be10e42 I added a condition where the use of window(max | balanced) is allowed only if the option impute(stag | instag) is specified. The use of impute(stag | instag) helps check: i) wheter the policyvar follows staggered adoption and ii) impute the policyvar in case it does follow staggered adoption. Then, this translates into no missing values in the event-time dummies and prevents the cases where the maximum window cannot be estimated due to missing values in the event-time dummies (e.g., https://github.com/JMSLab/xtevent/pull/178#issuecomment-2068325785). This change is intended to implement https://github.com/JMSLab/xtevent/pull/178#issuecomment-2070911411. For instance, adding impute(stag) to the same example:

use example31, clear
xtevent y eta , panelvar(i) timevar(t) policyvar(z) impute(stag) window(max)

completes the estimation and calculates a window of (-18 16) plus the two endpoints. I add more relate checks in the do-file in the issue folder.

Constantino-Carreto-Romero commented 7 months ago

@Constantino-Carreto-Romero I get an error when running xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(max) impute(stag) trend(-2) This happens because the check for whether the trend starting point is in the estimation window is using the values of lwindow and rwindow and at the time of the check these are strings. Maybe the check needs to be later, and it must use the calculated window limits.

@jorpppp In https://github.com/JMSLab/xtevent/commit/de25d54a0c276cce974e52e8fce47720d166f612 and https://github.com/JMSLab/xtevent/commit/9fcf9b35c789fbda477e942821b8195d2df2b2f4 I kept the check in _eventols but conditioning that window must be numeric. Then, in _eventgenvars, once the window is calculated (if window(max | balanced) was specified), the same check is carried out. The check must be placed in _eventgenvars, before the code that specifies the coefficients that will be normalized. I kept the check in _eventols for the numeric window case, so the program can stop at that instance and not let it stop until _eventgenvars. Should I change it to a single check in _eventgenvars?

Constantino-Carreto-Romero commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) I get an error message. The found window limits are -1 and -1.

This doesn't seem right. If we run

xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) savek(event)
bys i: egen min = min(event_evtime)
bys i: egen max = max(event_evtime)
preserve
bys i: keep if _n == 1
tab min
tab max
restore 

We get that all treated units have at least two periods pretreatment and one period post treatment (the treatment period, 0). So the balanced window should be -2, 0. Why is it not the case? Am I misreading how the balanced window should be calculated?

@jorpppp The calculated window is (-1 -1) plus the endpoints (-2+, 0+). The calculated limits are indeed (-2 0) but I follow the definition used for a numeric input for window and separate the periods covered by window and the endpoints. After the changes in https://github.com/JMSLab/xtevent/commit/267cef77ee4a32a1c26cce9d41e129d88cd45be0, running the same example produces this more informative error message.

jorpppp commented 7 months ago

@Constantino-Carreto-Romero I get an error when running xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(max) impute(stag) trend(-2) This happens because the check for whether the trend starting point is in the estimation window is using the values of lwindow and rwindow and at the time of the check these are strings. Maybe the check needs to be later, and it must use the calculated window limits.

@jorpppp In de25d54 and 9fcf9b3 I kept the check in _eventols but conditioning that window must be numeric. Then, in _eventgenvars, once the window is calculated (if window(max | balanced) was specified), the same check is carried out. The check must be placed in _eventgenvars, before the code that specifies the coefficients that will be normalized. I kept the check in _eventols for the numeric window case, so the program can stop at that instance and not let it stop until _eventgenvars. Should I change it to a single check in _eventgenvars?

@Constantino-Carreto-Romero Thanks. It's ok to keep the check in both places.

jorpppp commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) I get an error message. The found window limits are -1 and -1. This doesn't seem right. If we run

xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) savek(event)
bys i: egen min = min(event_evtime)
bys i: egen max = max(event_evtime)
preserve
bys i: keep if _n == 1
tab min
tab max
restore 

We get that all treated units have at least two periods pretreatment and one period post treatment (the treatment period, 0). So the balanced window should be -2, 0. Why is it not the case? Am I misreading how the balanced window should be calculated?

@jorpppp The calculated window is (-1 -1) plus the endpoints (-2+, 0+). The calculated limits are indeed (-2 0) but I follow the definition used for a numeric input for window and separate the periods covered by window and the endpoints. After the changes in 267cef7, running the same example produces this more informative error message.

@Constantino-Carreto-Romero Thanks. The error message is still a bit confusing. Maybe write "the calculated window is (-1,-1), excluding the endpoints". The -2+, 0+ is confusing, I don't know what the plus signs mean. So I would omit it.

In testing this I found a possible bug: xtevent y, w(0 3) pol(z) impute(stag) returns an error with the example31 data. Can you check please?

Constantino-Carreto-Romero commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) I get an error message. The found window limits are -1 and -1. This doesn't seem right. If we run

xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) savek(event)
bys i: egen min = min(event_evtime)
bys i: egen max = max(event_evtime)
preserve
bys i: keep if _n == 1
tab min
tab max
restore 

We get that all treated units have at least two periods pretreatment and one period post treatment (the treatment period, 0). So the balanced window should be -2, 0. Why is it not the case? Am I misreading how the balanced window should be calculated?

@jorpppp The calculated window is (-1 -1) plus the endpoints (-2+, 0+). The calculated limits are indeed (-2 0) but I follow the definition used for a numeric input for window and separate the periods covered by window and the endpoints. After the changes in 267cef7, running the same example produces this more informative error message.

@Constantino-Carreto-Romero Thanks. The error message is still a bit confusing. Maybe write "the calculated window is (-1,-1), excluding the endpoints". The -2+, 0+ is confusing, I don't know what the plus signs mean. So I would omit it.

@jorpppp I edited the message in https://github.com/JMSLab/xtevent/commit/8409db79fc408499f779215ade1eb9fe0f43f28b. Is it ok?

In testing this I found a possible bug: xtevent y, w(0 3) pol(z) impute(stag) returns an error with the example31 data. Can you check please?

I tested it and I got the error message. The error comes from this line https://github.com/JMSLab/xtevent/blob/8409db79fc408499f779215ade1eb9fe0f43f28b/xtevent/_eventgenvars.ado#L454 where the temporary variable being analyzed was not defined because this code section https://github.com/JMSLab/xtevent/blob/8409db79fc408499f779215ade1eb9fe0f43f28b/xtevent/_eventgenvars.ado#L430 was not executed because the left window is 0. I'm not sure whether it is a bug. It is not related to the automatic-window implementation, can I open a separate issue to check it?

jorpppp commented 7 months ago

@Constantino-Carreto-Romero If I run xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) I get an error message. The found window limits are -1 and -1. This doesn't seem right. If we run

xtevent y eta, panelvar(i) timevar(t) policyvar(z) w(balanced) impute(stag) savek(event)
bys i: egen min = min(event_evtime)
bys i: egen max = max(event_evtime)
preserve
bys i: keep if _n == 1
tab min
tab max
restore 

We get that all treated units have at least two periods pretreatment and one period post treatment (the treatment period, 0). So the balanced window should be -2, 0. Why is it not the case? Am I misreading how the balanced window should be calculated?

@jorpppp The calculated window is (-1 -1) plus the endpoints (-2+, 0+). The calculated limits are indeed (-2 0) but I follow the definition used for a numeric input for window and separate the periods covered by window and the endpoints. After the changes in 267cef7, running the same example produces this more informative error message.

@Constantino-Carreto-Romero Thanks. The error message is still a bit confusing. Maybe write "the calculated window is (-1,-1), excluding the endpoints". The -2+, 0+ is confusing, I don't know what the plus signs mean. So I would omit it.

@jorpppp I edited the message in 8409db7. Is it ok?

@Constantino-Carreto-Romero let me make a minor change to it in a push. After that I'll test the code a bit more and if it's ok I'll merge the PR.

In testing this I found a possible bug: xtevent y, w(0 3) pol(z) impute(stag) returns an error with the example31 data. Can you check please?

I tested it and I got the error message. The error comes from this line

https://github.com/JMSLab/xtevent/blob/8409db79fc408499f779215ade1eb9fe0f43f28b/xtevent/_eventgenvars.ado#L454

where the temporary variable being analyzed was not defined because this code section https://github.com/JMSLab/xtevent/blob/8409db79fc408499f779215ade1eb9fe0f43f28b/xtevent/_eventgenvars.ado#L430

was not executed because the left window is 0. I'm not sure whether it is a bug. It is not related to the automatic-window implementation, can I open a separate issue to check it?

@Constantino-Carreto-Romero Yes, please open a new issue for this.

jorpppp commented 7 months ago

@Constantino-Carreto-Romero We still have bugs in this. With the example31 data:

gen pois = rpoisson(5) in 1/200
xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag)

returns an error message. I presume touse is not being set somewhere and the calculated window is using the entire sample and not the touse sample. Note that if we pass a sample restriction to xtevent this works:

xtevent y eta i.pois if pois!=., panelvar(i) timevar(t) pol(z) window(max) impute(stag)

Can you check this please?

Constantino-Carreto-Romero commented 7 months ago

@Constantino-Carreto-Romero We still have bugs in this. With the example31 data:

gen pois = rpoisson(5) in 1/200
xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag)

returns an error message. I presume touse is not being set somewhere and the calculated window is using the entire sample and not the touse sample. Note that if we pass a sample restriction to xtevent this works:

xtevent y eta i.pois if pois!=., panelvar(i) timevar(t) pol(z) window(max) impute(stag)

Can you check this please?

@jorpppp I tried running the same example but cannot replicate the error message. When running

use example31, clear
gen pois = rpoisson(5) in 1/200
xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag)

I get the following outcome.

jorpppp commented 7 months ago

@Constantino-Carreto-Romero We still have bugs in this. With the example31 data:

gen pois = rpoisson(5) in 1/200
xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag)

returns an error message. I presume touse is not being set somewhere and the calculated window is using the entire sample and not the touse sample. Note that if we pass a sample restriction to xtevent this works:

xtevent y eta i.pois if pois!=., panelvar(i) timevar(t) pol(z) window(max) impute(stag)

Can you check this please?

@jorpppp I tried running the same example but cannot replicate the error message. When running

use example31, clear
gen pois = rpoisson(5) in 1/200
xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag)

I get the following outcome.

@Constantino-Carreto-Romero did not get the error because he wasn't trying to plot. Strictly speaking the error message appears with

xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag) plot

In any case there is a bug because the calculated window is not taking the missing values of the control variables into account, but it should. The issue has to do with the way samples are marked throughout: _eventgenvars is working with a sample that has not been marked for missing values of the variables, which is convenient for other reasons, e.g. imputation. @Constantino-Carreto-Romero will check this.

Constantino-Carreto-Romero commented 7 months ago

@Constantino-Carreto-Romero We still have bugs in this. With the example31 data:

gen pois = rpoisson(5) in 1/200
xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag)

returns an error message. I presume touse is not being set somewhere and the calculated window is using the entire sample and not the touse sample. Note that if we pass a sample restriction to xtevent this works:

xtevent y eta i.pois if pois!=., panelvar(i) timevar(t) pol(z) window(max) impute(stag)

Can you check this please?

@jorpppp I tried running the same example but cannot replicate the error message. When running

use example31, clear
gen pois = rpoisson(5) in 1/200
xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag)

I get the following outcome.

@Constantino-Carreto-Romero did not get the error because he wasn't trying to plot. Strictly speaking the error message appears with

xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag) plot

In any case there is a bug because the calculated window is not taking the missing values of the control variables into account, but it should. The issue has to do with the way samples are marked throughout: _eventgenvars is working with a sample that has not been marked for missing values of the variables, which is convenient for other reasons, e.g. imputation. @Constantino-Carreto-Romero will check this.

@jorpppp in https://github.com/JMSLab/xtevent/commit/6dcd33c45d6b77922f60bd2c02dc4d84e707fb2e I modified the syntax of _eventgenvars, so it can handle the local varlist. Then, I added a marksample to mark the non-missing observations in varlist and consider this restriction when calculating the window limits. This doesn't affect the already defined mark because the latter only considers if or in restrictions. I also updated the calls to _eventgenvars from other programs. I include in the do-file in the issue folder examples to test that:

examples: https://github.com/JMSLab/xtevent/blob/97dff856316fb4f9fcdb8abad2016b63f0a46791/issue170/issue170.do#L78

Constantino-Carreto-Romero commented 7 months ago

@jorpppp in https://github.com/JMSLab/xtevent/pull/178/commits/97dff856316fb4f9fcdb8abad2016b63f0a46791 I uploaded an issue folder with a do-file with examples to test: i) error messages; ii) correct functionality; iii) interaction with other options and iv; that the implementation doesn't alter other functionalities.

jorpppp commented 7 months ago

@Constantino-Carreto-Romero We still have bugs in this. With the example31 data:

gen pois = rpoisson(5) in 1/200
xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag)

returns an error message. I presume touse is not being set somewhere and the calculated window is using the entire sample and not the touse sample. Note that if we pass a sample restriction to xtevent this works:

xtevent y eta i.pois if pois!=., panelvar(i) timevar(t) pol(z) window(max) impute(stag)

Can you check this please?

@jorpppp I tried running the same example but cannot replicate the error message. When running

use example31, clear
gen pois = rpoisson(5) in 1/200
xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag)

I get the following outcome.

@Constantino-Carreto-Romero did not get the error because he wasn't trying to plot. Strictly speaking the error message appears with

xtevent y eta i.pois, panelvar(i) timevar(t) pol(z) window(max) impute(stag) plot

In any case there is a bug because the calculated window is not taking the missing values of the control variables into account, but it should. The issue has to do with the way samples are marked throughout: _eventgenvars is working with a sample that has not been marked for missing values of the variables, which is convenient for other reasons, e.g. imputation. @Constantino-Carreto-Romero will check this.

@jorpppp in 6dcd33c I modified the syntax of _eventgenvars, so it can handle the local varlist. Then, I added a marksample to mark the non-missing observations in varlist and consider this restriction when calculating the window limits. This doesn't affect the already defined mark because the latter only considers if or in restrictions. I also updated the calls to _eventgenvars from other programs. I include in the do-file in the issue folder examples to test that:

  • the calculation of the window limits considers missing values in the local varlist
  • the consideration of if or in conditions were not affected

examples:

https://github.com/JMSLab/xtevent/blob/97dff856316fb4f9fcdb8abad2016b63f0a46791/issue170/issue170.do#L78

Thanks @Constantino-Carreto-Romero . This approach works but has a slight disadvantage. _eventgenvars is recalculating the marked sample that considers missing values in the variables. But this marked sample was already calculated by _eventols or _eventiv.

So instead of passing the varlist to _eventgenvars to repeat the marking, we can just pass the marked sample variable from _eventols or _eventiv to _eventgenvars as an option.

Constantino-Carreto-Romero commented 7 months ago

Thanks @Constantino-Carreto-Romero . This approach works but has a slight disadvantage. _eventgenvars is recalculating the marked sample that considers missing values in the variables. But this marked sample was already calculated by _eventols or _eventiv.

So instead of passing the varlist to _eventgenvars to repeat the marking, we can just pass the marked sample variable from _eventols or _eventiv to _eventgenvars as an option.

@jorpppp thanks. In https://github.com/JMSLab/xtevent/commit/d7ae4e6ee6bb3a31b238ac27ab48799071062b5e I reverted the changes to _eventgenvars's syntax and instead of, I passed the marked sample variable from _eventols, _eventiv, etc. as an option in _eventgenvars. Then, I ran the do-file in the issue folder to check that this change worked.

Constantino-Carreto-Romero commented 7 months ago

When running

use example31, clear
xtevent y eta, panelvar(i) timevar(t) pol(z) impute(stag) window(max) diffavg 

There is a bug because we get a calculated left endpoint of -19, but the computation of the differences of averages only includes until coefficient -9.