conroylau / lpinfer

lpinfer: An R Package for Inference in Linear Programs
GNU General Public License v3.0

New procedure: Cho and Russell #79

Closed a-torgovitsky closed 4 years ago

a-torgovitsky commented 4 years ago

Here is a description of the procedure: chorussell.pdf

It is based on this paper: https://arxiv.org/abs/1810.03180

Let me know if you have any questions!

conroylau commented 4 years ago

Thanks for sending the new procedure! I have a question about the case where a bound is unbounded above or below. Suppose estbounds returns a bound like [a, +Inf) based on the sample data, where a is a real number. Then both \hat{\theta}_{ub} and \Delta are infinite. On the other hand, it seems that \hat{\theta}^b_{ub} can be finite or infinite depending on the bootstrap data. How should I handle the terms involving \hat{\theta}_{ub} and \Delta that appear in the constraints? For instance, to find c_{lb}, is it sufficient to consider the values \{\sqrt{n}(\hat{\theta}^b_{lb} - \hat{\theta}_{lb})\}_{b=1}^B and -Inf, because \hat{\theta}^b_{lb} - \hat{\theta}_{lb} - \Delta is always -Inf in this example?

On the other hand, am I correct that I can reject immediately and set the p-value as 0 if the LP/QP is not feasible?

Thanks!

a-torgovitsky commented 4 years ago

> Then both \hat{\theta}_{ub} and \Delta are infinite. On the other hand, it seems that \hat{\theta}^b_{ub} can be finite or infinite depending on the bootstrap data. How should I handle the terms involving \hat{\theta}_{ub} and \Delta that appear in the constraints? For instance, to find c_{lb}, is it sufficient to consider the values \{\sqrt{n}(\hat{\theta}^b_{lb} - \hat{\theta}_{lb})\}_{b=1}^B and -Inf, because \hat{\theta}^b_{lb} - \hat{\theta}_{lb} - \Delta is always -Inf in this example?

Yes, if I understand correctly what you are saying, in this example you would only have to think about c_{lb}. If the point estimate for the upper bound is +\infty, then the upper bound of the confidence interval is going to be infinite too!
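To make the one-sided case concrete, here is a minimal sketch in Python (not lpinfer code). It assumes that the only constraint left to satisfy is \sqrt{n}(\hat{\theta}^b_{lb} - \hat{\theta}_{lb}) \le c_{lb} for at least a (1 - \alpha) share of the bootstrap draws, and that the interval takes the form [\hat{\theta}_{lb} - c_{lb}/\sqrt{n}, +\infty). Both are assumptions made for illustration, not the definition in the notes, and the function and argument names are hypothetical.

```python
import math


def one_sided_ci(lb_hat, lb_boot, n, alpha):
    """Sketch of the case where the estimated upper bound is +Inf.

    Assumption: c_lb is the smallest candidate value such that at least
    (1 - alpha) of the bootstrap draws satisfy
    sqrt(n) * (lb_b - lb_hat) <= c_lb.
    """
    root_n = math.sqrt(n)
    # Candidate values: the centered and scaled bootstrap lower bounds.
    stats = sorted(root_n * (lb_b - lb_hat) for lb_b in lb_boot)
    # Index of the smallest candidate covering a (1 - alpha) share of draws.
    k = max(math.ceil((1 - alpha) * len(stats)), 1)
    c_lb = stats[k - 1]
    # Assumed interval form; the infinite upper endpoint is passed through.
    return c_lb, (lb_hat - c_lb / root_n, math.inf)
```

Under these assumptions, the bootstrap upper bounds never enter the computation, which matches the point that only c_{lb} needs attention here.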

> On the other hand, am I correct that I can reject immediately and set the p-value as 0 if the LP/QP is not feasible?

What LP/QP are you referring to? Is it the one from estbounds? That is, are you saying to reject immediately if mincriterion is not zero? I do not want to do that, and I don't think there is any reason we need to. Is there something in the procedure that makes this impossible? I didn't see anything. But I guess one corollary is that you need to let the user pass kappa and norm to this procedure as well, since they are used in estbounds. (The CR paper is written without considering the possibility that mincriterion > 0, i.e. everything is as if estimate = FALSE in estbounds. But we still want to handle the case where mincriterion > 0, since this is common in applications.)

conroylau commented 4 years ago

Thanks for the comments and explanations!

Regarding the second point, yes, the LP/QP I was referring to is the one solved when estimate = FALSE in estbounds (the "true" problem). Sure, I will let users pass kappa and norm to this new procedure. Do you think I should also keep the estimate option in this new procedure, as in estbounds?

a-torgovitsky commented 4 years ago

Yes, I think letting them pass estimate is also a good idea, although the default should be the same as in estbounds.

Also, if they pass estimate = FALSE, it is quite likely, at least in some situations, that there will be many warning messages (one for each bootstrap iteration). So it would be good to have a way to collect those instead of spamming the screen. At the end, one could issue a single warning that says how many of the bootstrap iterations needed to be estimated.

conroylau commented 4 years ago

Sure, no problem. I will include estimate in the new procedure with the same default as in estbounds. I will also store the warning/error messages in df.error, as in the other procedures, and issue one message at the end if there are any warnings. Thanks!
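For reference, the collect-then-summarize pattern can be sketched as follows. This is a Python illustration of the idea only, not the R implementation in lpinfer: `run_bootstrap`, `solve`, and the `(iteration, message)` record layout are hypothetical stand-ins for the bootstrap loop and for df.error.

```python
import warnings


def run_bootstrap(draws, solve):
    """Run `solve` on each draw, recording warnings instead of printing them."""
    results, errors = [], []
    for b, draw in enumerate(draws, start=1):
        # Capture every warning raised inside this iteration.
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            results.append(solve(draw))
        # Store (iteration, message) pairs, loosely analogous to df.error.
        errors.extend((b, str(w.message)) for w in caught)
    if errors:
        n_bad = len({b for b, _ in errors})
        # One summary warning at the end rather than one per iteration.
        warnings.warn(
            f"{n_bad} of {len(draws)} bootstrap iterations raised warnings."
        )
    return results, errors
```

The per-iteration messages stay available in `errors` for inspection, while the user sees a single summary.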

conroylau commented 4 years ago

By the way, I have two more questions regarding the procedure:

  1. On the first line of page 2 of your notes, should the \hat{\theta}_{ub} at the end of the line be \hat{\theta}_{lb} instead?
  2. Regarding the optimization problem on page 1, should the (1 - \alpha)-confidence interval be [equation image: CodeCogsEqn (10)] instead of [c_{lb}(\alpha), c_{ub}(\alpha)]? The optimization problem seems to be the same as (3.26) - (3.28) of the paper, although we are using different notation, and as far as I understand this is how they construct the bootstrap confidence interval.

Thanks!

a-torgovitsky commented 4 years ago

Yes on both counts. Sorry, I wrote that too quickly! Just for the record, here is a revised version that makes those fixes: chorussell.pdf

conroylau commented 4 years ago

I see, thanks for the fixes!

conroylau commented 4 years ago

By the way, do you think I should also allow the parameter kappa to be a vector in this new procedure? Thanks!

a-torgovitsky commented 4 years ago

Yes absolutely

conroylau commented 4 years ago

Sure!

conroylau commented 4 years ago

Done! I have just added the new procedure together with the unit tests and example code.

Also, since this procedure directly builds a confidence interval instead of testing a single point, I have done the following:

  1. Added an option ci to the chorussell procedure, so that a p-value is returned if ci is FALSE and a confidence interval is constructed if ci is TRUE.
  2. When chorussell is passed to invertci, invertci will not use the bisection method to construct the confidence interval. Instead, it will call the chorussell procedure with ci set to TRUE to construct the confidence interval.

Does this setup make sense to you? Thanks!
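The dispatch described in item 2 can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the lpinfer code: `invert_ci`, `bisect_endpoint`, the `direct_ci` argument, and the bracket arguments are hypothetical names, and the bisection assumes the p-value crosses alpha exactly once inside each bracket.

```python
def bisect_endpoint(pvalue, inside, outside, alpha, tol=1e-6):
    """Locate the point between `inside` (not rejected) and `outside`
    (rejected) where the p-value crosses alpha."""
    while abs(outside - inside) > tol:
        mid = (inside + outside) / 2
        if pvalue(mid) >= alpha:
            inside = mid   # mid is not rejected: move the inner point out
        else:
            outside = mid  # mid is rejected: shrink the outer point in
    return inside


def invert_ci(pvalue, alpha, lb_bracket, ub_bracket, direct_ci=None):
    """Return a (lower, upper) confidence interval.

    If the procedure can build the interval itself (as with ci = TRUE),
    use that directly; otherwise invert the test by bisection.
    """
    if direct_ci is not None:
        return direct_ci(alpha)
    lb_out, lb_in = lb_bracket  # (rejected point, non-rejected point)
    ub_in, ub_out = ub_bracket  # (non-rejected point, rejected point)
    return (
        bisect_endpoint(pvalue, lb_in, lb_out, alpha),
        bisect_endpoint(pvalue, ub_in, ub_out, alpha),
    )
```

Skipping the bisection avoids re-running the bootstrap at every candidate point when the procedure already yields the interval in one pass.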

a-torgovitsky commented 4 years ago

Yes that makes sense