jkcshea / ivmte

An R package for implementing the method in Mogstad, Santos, and Torgovitsky (2018, Econometrica).
GNU General Public License v3.0

Bootstrap output is confusing #62

Closed a-torgovitsky closed 5 years ago

a-torgovitsky commented 5 years ago

Right now it looks like this:

Obtaining propensity scores...

Generating target moments...

    Integrating terms for control group...

    Integrating terms for treated group...

Generating IV-like moments...
    Moment 1...
    Moment 2...
    Moment 3...
    Moment 4...

Performing audit procedure...

Audit count: 1 
Minimum criterion: 0 

Obtaining bounds...

Audit ending: no violations of monotonicity or boundedness restrictions by points chosen off of the grid defining shape restrictions for the LP problem. 
Bounds on the target parameter: [0.286442508429307, 0.286442508429307]

Audit count: 1 
Minimum criterion: 0 

Obtaining bounds...

Audit ending: no violations of monotonicity or boundedness restrictions by points chosen off of the grid defining shape restrictions for the LP problem. 
Bounds on the target parameter: [0.142160579026397, 0.142160579026397]

Bootstrap iteration 1...
Audit count: 1 
Minimum criterion: 0 

Obtaining bounds...

Audit ending: no violations of monotonicity or boundedness restrictions by points chosen off of the grid defining shape restrictions for the LP problem. 
Bounds on the target parameter: [0.306351404539633, 0.306351404539633]

Bootstrap iteration 2...
Audit count: 1 
Minimum criterion: 0 

Obtaining bounds...

Audit ending: no violations of monotonicity or boundedness restrictions by points chosen off of the grid defining shape restrictions for the LP problem. 
Bounds on the target parameter: [0.27060920887329, 0.27060920887329]

Bootstrap iteration 3...
Warning messages:
1: No list of components provided. All covariates in each IV-like specification will be included when constructing each S-set. 
2: In ivmte(data = df, ivlike = ivlike, target = "ate", bootstraps = 3,  :
  'treat' argument is not declared. Dependent variable from the propensity score formula, 'd', will be used as the treatment variable.

A few issues with that:

  1. The statement "bootstrap iteration x" seems to come AFTER the output for the bootstrap?
  2. No need to have "Obtaining bounds" each time
  3. Generally too much spacing, hard to read
  4. No summary statement at the end?

So let's make it look like this:

Bootstrap iteration 1:
    Audit count: x1
    Minimum criterion: y1
    Bounds: [a1, b1] # or Point estimate: a1
Bootstrap iteration 2:
    Audit count: x2
    Minimum criterion: y2
    Bounds: [a2, b2] # or Point estimate: a2
...
Bootstrapped confidence intervals:
99%: [, ]
95%: [, ]
90%: [, ]
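
A minimal R sketch of the per-iteration layout proposed above. It assumes each bootstrap's results sit in a list with fields `audit_count`, `min_criterion`, and `bounds`; those names and the helper `format_iteration()` are hypothetical and not part of ivmte. The values are made up for illustration.

```r
# Hypothetical helper: format one bootstrap iteration in the proposed layout.
# `res` is assumed to be a list with audit_count, min_criterion, and bounds.
format_iteration <- function(i, res) {
  lo <- res$bounds[1]
  hi <- res$bounds[2]
  # Report a point estimate when the bounds coincide, otherwise the interval.
  last <- if (isTRUE(all.equal(lo, hi))) {
    sprintf("    Point estimate: %s", format(lo))
  } else {
    sprintf("    Bounds: [%s, %s]", format(lo), format(hi))
  }
  c(sprintf("Bootstrap iteration %d:", i),
    sprintf("    Audit count: %d", as.integer(res$audit_count)),
    sprintf("    Minimum criterion: %s", format(res$min_criterion)),
    last)
}

# Made-up values for illustration only.
results <- list(
  list(audit_count = 1, min_criterion = 0, bounds = c(0.25, 0.25)),
  list(audit_count = 1, min_criterion = 0.1, bounds = c(-0.5, 0.5))
)
lines <- unlist(Map(format_iteration, seq_along(results), results))
cat(lines, sep = "\n")
```

Building the lines as a character vector before printing keeps the indentation in one place and makes the layout easy to check.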

a-torgovitsky commented 5 years ago

Also, add the p-value below the final confidence interval

jkcshea commented 5 years ago

Done!

I decided to include a count of total bootstraps, and failed bootstraps. So output looks like this now:

...
Bootstrap iteration 99...
    Audit count: 1
    Minimum criterion: 0.176757597496557
    Bounds:[-0.318856082881249, -0.192836279564623]
Bootstrap iteration 100...
    Audit count: 1
    Minimum criterion: 0.0777525210320685
    Bounds:[-0.352005784436018, -0.295006113361368]

Bootstrap summary:
    Number of bootstraps: 100
    Failed bootstraps: 1

Bootstrapped confidence intervals (backward):
    90%: [-0.383598585256906, -0.0964734284951312]
    95%: [-0.399875498865055, -0.0735710866056285]
    99%: [-0.43570026089672, 0.00604090604091434]

Bootstrapped confidence intervals (forward):
    90%: [-0.449013691626266, -0.0848656684243104]
    95%: [-0.468142179461607, -0.0768431839542707]
    99%: [-0.487511434543914, -0.00621765235860217]

Bootstrapped p-values: 
    Backward: 0.02
    Forward:  0.01

Do you want to round any of those results?

a-torgovitsky commented 5 years ago

Looks great, but a couple of questions:

  1. What is a failed bootstrap?

  2. Are you outputting

    Bootstrap iteration 99...
    Audit count: 1
    Minimum criterion: 0.176757597496557
    Bounds:[-0.318856082881249, -0.192836279564623]

    for each of the 100 iterations? If so there should be a way to turn this off if the user wants.

  3. Do we have a general rounding scheme? What do standard R commands do on this?

jkcshea commented 5 years ago

> What is a failed bootstrap?

This is when the resampling results in, say, an infeasible LP problem, so no bounds are obtained.

The issue was brought up here: https://github.com/jkcshea/IVMTE/issues/31#issuecomment-441506695

You explained that it was not good to discard these 'failed bootstraps' and just resample, as that would introduce sample selection. It seems we never concluded what to do when a bootstrap fails, though.

> Are you outputting ... for each of the 100 iterations?

Only if noisy = TRUE. If noisy = FALSE, only the final output is given.
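
As a rough illustration of that switch, here is a sketch in which per-iteration output is gated on `noisy` and the final summary is always printed. The function `run_bootstraps()` and its placeholder estimate are hypothetical stand-ins, not ivmte's actual internals.

```r
# Hypothetical stand-in for the bootstrap loop; not ivmte's internals.
run_bootstraps <- function(n, noisy = TRUE) {
  estimates <- numeric(n)
  for (i in seq_len(n)) {
    estimates[i] <- i / n  # placeholder for the real per-iteration estimate
    # Per-iteration output is printed only when noisy = TRUE.
    if (noisy) cat(sprintf("Bootstrap iteration %d...\n", i))
  }
  # The final summary is printed regardless of `noisy`.
  cat(sprintf("Number of bootstraps: %d\n", n))
  invisible(estimates)
}

quiet <- capture.output(run_bootstraps(3, noisy = FALSE))
loud  <- capture.output(run_bootstraps(3, noisy = TRUE))
```

Here `quiet` holds only the summary line, while `loud` also holds one line per iteration.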

> Do we have a general rounding scheme?

So far, we have no standard rounding scheme.

> What do standard R commands do on this?

Playing with R, it seems like it will round to 7 significant figures when decimals are involved.

From playing with the lm command, it looks like it will allow for up to 7 digits, with a maximum of 4 decimal places, before going into scientific notation.

But to be honest, I'm not entirely sure, and there wasn't documentation on this.
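
For reference, a short sketch of the base R behaviour being described. Default printing uses `getOption("digits")` significant digits (7 unless the user changes it), and `print.lm()` has the default argument `digits = max(3L, getOption("digits") - 3L)`, which gives 4 under the default option. The value of `x` is one of the bounds printed earlier in this thread.

```r
x <- 0.286442508429307  # a bound from the output earlier in this thread

# Default printing uses getOption("digits") significant digits (7 unless changed).
default_digits <- getOption("digits")
format(x, digits = 7)          # "0.2864425"

# print.lm() defaults to max(3, getOption("digits") - 3) significant digits.
lm_digits <- max(3L, 7L - 3L)
format(x, digits = lm_digits)  # "0.2864"
```

The switch to scientific notation is governed separately by the `scipen` option, so it is not covered by this sketch.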

a-torgovitsky commented 5 years ago

Output: Sounds good. Making the output look like lm seems like the way to go, so maybe do it their way? I assume this is just a display issue...i.e. the user can get the full number from the output if they desire.

Failed bootstraps: Yes, but we shouldn't have anything "failing" from infeasibility. Remember that the estimator is designed so that infeasibility is not possible (see https://github.com/jkcshea/IVMTE/issues/41#issuecomment-473340879).

So something is going on here that needs to be investigated.

jkcshea commented 5 years ago

It turns out the failed bootstrap occurred simply because I had not fully undone all the code related to the Halton sequence. After fully reverting to the original audit code, there are no more failed bootstraps. So I also removed the bootstrap summary.

Given #41, the code now stops whenever a bootstrap fails, and spits out the error for the user.

> Making the output look like lm seems like the way to go, so maybe do it their way? I assume this is just a display issue...i.e. the user can get the full number from the output if they desire.

And yes, that is indeed the case. So we print out much more concise output now, but the results are still saved in their full form.
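
One common way to get this behaviour in R is an S3 print method that rounds only for display. The sketch below assumes a made-up class `demo_result` with a single `bounds` field; it is not ivmte's actual return object or print method.

```r
# Hypothetical result object; the class name and fields are illustrative only.
new_result <- function(bounds) {
  structure(list(bounds = bounds), class = "demo_result")
}

# The print method rounds for display; the stored values are untouched.
print.demo_result <- function(x, digits = 4, ...) {
  shown <- format(x$bounds, digits = digits)
  cat(sprintf("Bounds: [%s, %s]\n", shown[1], shown[2]))
  invisible(x)
}

res <- new_result(c(-0.318856082881249, -0.192836279564623))
print(res)       # concise display
res$bounds[1]    # full-precision value is still stored
```

Because the rounding happens inside the print method, anything the user extracts from the object keeps its full precision.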

a-torgovitsky commented 5 years ago

Perfect!