I see. I got a few questions about this:
1. Should I add an additional input for the maximum number of iterations (and set it to default to perhaps 500)?
It probably makes more sense to make it a multiple of the bootstrap iterations the user put in, say 1.25x. This should be controllable by an option.
2. Am I correct that I should use a `tryCatch` function every time when the testing procedure is evaluating the data passed to the function? If there is an error, then I report the error/warning message that appears and try a new bootstrap run.
That seems appropriate, but you will have to play around with it to see if it makes sense. An error in the sample data should be fatal, as there is nothing else that can be drawn. A warning message for a bootstrap draw that fails seems appropriate.
3. If the total number of successful bootstrap runs is less than `R`, should I (1) return an error, or (2) carry out the computations with `R` replaced by the total number of successful bootstrap runs? Thanks!
(2) is better. The number of successful bootstrap runs should be shown in `summary`. If it is < `R`, then it should also be shown as a warning.
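To make the agreed-upon behaviour concrete, here is a minimal sketch of such a bootstrap loop. The helpers `draw_bootstrap()` and `evaluate_beta_obs()` are hypothetical placeholders for the resampling step and the user-supplied `beta.obs` evaluation; the actual package code will differ.

```r
# Minimal sketch of a bootstrap loop with per-draw error handling.
run_bootstrap <- function(data, R, Rmulti = 1.25) {
  max_draws <- ceiling(R * Rmulti)      # cap so a bad design cannot loop forever
  results <- list()
  errors  <- character(0)
  draws   <- 0
  while (length(results) < R && draws < max_draws) {
    draws <- draws + 1
    bs_data <- draw_bootstrap(data)     # resampling itself should not fail
    err_msg <- NULL
    out <- tryCatch(evaluate_beta_obs(bs_data),
                    error = function(e) { err_msg <<- conditionMessage(e); NULL })
    if (is.null(err_msg)) {
      results[[length(results) + 1]] <- out
    } else {
      errors <- c(errors, err_msg)      # record the message and skip this draw
    }
  }
  if (length(results) < R) {
    warning(sprintf("Only %d of %d bootstrap replications succeeded.",
                    length(results), R))
  }
  list(results = results, successful = length(results), errors = errors)
}
```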
I have just updated the `dkqs` code for `beta.obs` error handling in the bootstrap replications.
The way I structure the error handling part is similar to that in `subsample` (as mentioned in issue #54).
The similarities are:
- The `tryCatch` function covers the part that evaluates `beta.obs` from the bootstrap data.
- The `tryCatch` function does not cover the part that draws the bootstrap data, as there should be no problem from resampling the data.
- The error message is recorded when the computation for a `tau` failed.
- There is an `Rmulti` parameter that represents the multiplier to indicate the maximum number of bootstrap replications.
- The number of successful bootstrap replications is shown in the `summary` message.

However, the error handling workflow is slightly different from `subsample` because `dkqs` allows multiple tuning parameters (`tau`).
In each of the bootstrap replications, there are two error handling parts:
- The evaluation of `beta.obs`.
- The evaluation of the QP for each `tau`.

There are two cases to consider:
- If there is an error in evaluating `beta.obs`:
  -> Record the error and go to the next bootstrap replication without evaluating the QP.
- If there is no error in evaluating `beta.obs`:
  -> Evaluate the QP for different `tau` using the same bootstrapped `beta.obs`.
  -> If there is an error returned in the QP from one of the `tau` parameters, stop and record the id of the bootstrap replication, the problematic `tau` and the error messages. Discard the results computed for this `beta.obs` and go to the next bootstrap replication.

The rationale for doing this is to ensure the p-values of the DKQS procedure are monotonically increasing by using the same `beta.obs` for different `tau`s in the same bootstrap replication (as in issue #57). The summary message is also expanded to indicate the number of successful and failed bootstrap replications.
Similar to the case for `subsample` (issue #54), I skip the line on the number of failed bootstrap replications if all of the bootstrap replications are successful.
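A rough sketch of this two-level structure is given below. The helpers `bootstrap_beta_obs()` and `solve_qp()` are hypothetical placeholders (with `solve_qp()` assumed to return the bootstrap test statistic for one `tau`); the real `dkqs` internals will differ.

```r
# Sketch of one bootstrap replication in dkqs-style error handling:
# a single beta.obs draw is shared across all tau values, and the whole
# replication is discarded if any tau fails.
run_one_replication <- function(bs_data, taus, rep_id) {
  beta_b <- tryCatch(bootstrap_beta_obs(bs_data), error = function(e) e)
  if (inherits(beta_b, "error")) {
    # Case 1: beta.obs failed -- record and skip the QP entirely
    return(list(ok = FALSE, error = conditionMessage(beta_b), rep = rep_id))
  }
  # Case 2: beta.obs succeeded -- evaluate the QP for every tau
  stats <- numeric(length(taus))
  for (i in seq_along(taus)) {
    qp <- tryCatch(solve_qp(beta_b, taus[i]), error = function(e) e)
    if (inherits(qp, "error")) {
      # A single failing tau discards the whole replication, so the p-values
      # stay monotone in tau (they all use the same beta.obs draw)
      return(list(ok = FALSE, error = conditionMessage(qp),
                  rep = rep_id, tau = taus[i]))
    }
    stats[i] <- qp
  }
  list(ok = TRUE, stats = stats, rep = rep_id)
}
```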
Below is a sample output:
p-value: 0.2
Maximum feasible tau: 0.76923
Test statistic: 0.01746
Solver used: gurobi
Number of cores used: 8
Number of successful bootstrap replications: 200
May I know if you think the above workflow is fine? Thanks!
I think this looks good for now. We may want to refine it later.
Errors with `beta.obs` and with the solver tend to be distinct issues.
For example, in my own testing (with my own Julia code), I have found some designs in which the optimization problems in `fsst` can occasionally fail to be solved using Gurobi's default settings, but can be solved if one changes, say, the method to use the primal simplex.
For situations like these, we probably want to do more work trying to recover from the failure, say by iterating through different Gurobi settings until we find one that works, and only giving up if we can't find one.
I anticipate that such a procedure would be useful as a post-solve diagnostic for all of the methods we are using.
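As an illustration, a fallback loop over Gurobi's `Method` parameter could look roughly like the sketch below (the wrapper name and the particular set of settings tried are just an example, not a prescription).

```r
# Example fallback over Gurobi algorithm settings (illustrative only).
# 'model' is a model list in the format expected by the gurobi R interface;
# Method: -1 = automatic, 0 = primal simplex, 1 = dual simplex, 2 = barrier.
solve_with_fallback <- function(model, methods = c(-1, 0, 1, 2)) {
  for (m in methods) {
    result <- tryCatch(
      gurobi::gurobi(model, params = list(Method = m, OutputFlag = 0)),
      error = function(e) NULL
    )
    if (!is.null(result) && identical(result$status, "OPTIMAL")) {
      return(result)
    }
  }
  stop("No Gurobi setting in 'methods' returned an optimal solution.")
}
```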
No need to worry about that now. Soon I will try some of the problematic designs and see if I can replicate these types of errors in the R package. Then I will open an issue about that.
I see, thanks! I will use a similar design of error handling in `fsst` by following what I did for `subsample` and `dkqs` for now.
Regarding the settings of Gurobi, would you want me to add an option in the functions that allows the users to pass a list of the parameters they want to specify in Gurobi? Right now, I have fixed the parameters in the Gurobi solver.
Yes I think that would be useful, at least for us as a debugging measure.
The annoying thing will be that for some methods (such as `fsst`) there are several different types of optimization problems, and each one might need different parameters. Not sure how best to handle that, but probably it should involve different sets of options.
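One way this could look, purely as a sketch (the problem-type keys below are made up, not the package's actual names): a named list of Gurobi parameter lists, one per type of optimization problem, merged with defaults inside the solver wrapper.

```r
# Hypothetical interface: per-problem Gurobi settings supplied by the user.
# The keys "range" and "test_stat" are placeholders, not actual fsst names.
solver_options <- list(
  range     = list(Method = 2, BarConvTol = 1e-9),
  test_stat = list(Method = 1, TimeLimit = 60)
)

# Merge the user's settings for a given problem type with package defaults.
get_params <- function(problem, options = solver_options,
                       defaults = list(OutputFlag = 0)) {
  extra <- options[[problem]]
  if (is.null(extra)) extra <- list()
  utils::modifyList(defaults, extra)
}

get_params("range")   # list(OutputFlag = 0, Method = 2, BarConvTol = 1e-09)
```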
Sure, I will add that option into the testing functions.
Indeed, I am working on what would be the most efficient way to handle the errors in `beta.obs` and the optimization problem in `fsst`. Since the sigma matrix is computed before the optimization problem, and it is used to obtain the studentization matrix, which will be used in the optimization problems, I am thinking of a way to handle the potential errors where some `beta.obs` from the bootstrap replications can be computed but will cause problems in the LPs or QPs.
I am also designing some unit tests to artificially create some problematic bootstrap draws to make sure the functions can catch these errors in a proper way.
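For instance, one way to manufacture such draws in a unit test (a sketch only; `make_flaky_beta_obs` is a hypothetical helper, not part of the package) is a `beta.obs`-style function that fails on chosen bootstrap draws:

```r
library(testthat)

# Hypothetical helper: builds a beta.obs-style function that errors on the
# bootstrap draws listed in 'fail_on' and returns simple moments otherwise.
make_flaky_beta_obs <- function(fail_on = c(3, 7)) {
  counter <- 0
  function(data) {
    counter <<- counter + 1
    if (counter %in% fail_on) {
      stop("Artificial error in beta.obs (draw ", counter, ")")
    }
    colMeans(data)
  }
}

test_that("artificial beta.obs failures occur on the chosen draws", {
  f <- make_flaky_beta_obs(fail_on = 2)
  d <- data.frame(Y = rnorm(10), D = rbinom(10, 1, 0.5))
  expect_silent(f(d))                                 # draw 1 succeeds
  expect_error(f(d), "Artificial error in beta.obs")  # draw 2 fails
})
```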
Great idea!
I have just updated the error handling part for `beta.obs` in `fsst` for the non-parallel loops.
As per issue #72, I am reading about the `future` package, so I did not update the error handling parts for the parallel functions in `fsst` yet, as they might be removed in the end.
The error in `fsst` is handled in the following manner:
1. Compute the bootstrap replications of `beta.obs`. If there is any error in getting the `beta.obs`, drop those replications and compute the next bootstrap replications of `beta.obs` until the total number of successful replications is `R` or the maximum number of bootstrap replications allowed (based on `Rmulti`) is reached.
2. Based on the `beta.obs` obtained in step 1, compute the following terms (in the following sequence):
   a. Estimator of the asymptotic variance of `beta.obs` (if not already provided by the function)
   b. Standardization parameter and matrix
   c. Test statistics
   d. Bootstrap test statistics
3. If there is any error in computing the terms in step 2:
   a. Record the error messages.
   b. Drop the problematic `beta.obs`, test statistics, etc.
   c. Compute the number of `beta.obs` being dropped.
   d. Go back to step 1 and compute the required number of bootstrap replications.
   e. Re-compute the terms in step 2.

In addition, I have added the number of successful/failed bootstrap replications in the summary message as in `dkqs` and `subsample`. Thanks!
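The drop-and-replenish logic in steps 1-3 could be outlined roughly as follows. This is only a sketch: `draw_beta_obs()` and `compute_stats()` are hypothetical placeholders, and the real `fsst` computes shared quantities (variance, studentization) jointly rather than per draw.

```r
# Rough outline: keep topping up bootstrap beta.obs until R replications
# survive both step 1 (beta.obs) and step 2 (downstream computations),
# or the Rmulti cap on total draws is hit.
fsst_bootstrap_outline <- function(draw_beta_obs, compute_stats, R, Rmulti = 1.25) {
  max_draws <- ceiling(R * Rmulti)
  draws <- 0
  betas <- list()
  repeat {
    # Step 1: collect bootstrap beta.obs, skipping draws that error out
    while (length(betas) < R && draws < max_draws) {
      draws <- draws + 1
      b <- tryCatch(draw_beta_obs(), error = function(e) NULL)
      if (!is.null(b)) betas[[length(betas) + 1]] <- b
    }
    # Step 2: compute the downstream terms; failures come back as NULL
    stats <- lapply(betas, function(b) tryCatch(compute_stats(b), error = function(e) NULL))
    failed <- vapply(stats, is.null, logical(1))
    # Step 3: drop problematic beta.obs and go back to step 1, unless done
    if (!any(failed) || draws >= max_draws) {
      return(list(stats = stats[!failed], successful = sum(!failed), draws = draws))
    }
    betas <- betas[!failed]
  }
}
```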
I have updated the error-handling parts in the tests. I mainly updated how the error messages are consolidated.
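For example, one simple way to consolidate such messages (illustrative only, not necessarily how the package does it) is to deduplicate them and report how often each occurred:

```r
# Deduplicate bootstrap error messages and count how often each one occurred.
consolidate_errors <- function(error_msgs) {
  if (length(error_msgs) == 0) return(NULL)
  counts <- sort(table(unlist(error_msgs)), decreasing = TRUE)
  data.frame(message = names(counts), frequency = as.integer(counts),
             row.names = NULL)
}

consolidate_errors(c("NA in beta.obs", "Gurobi status: INFEASIBLE", "NA in beta.obs"))
# returns a two-row data frame: "NA in beta.obs" (2), "Gurobi status: INFEASIBLE" (1)
```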
In addition, I have added a new set of unit tests that mainly does two things:
- Artificially create some problematic bootstrap draws that return errors in `beta.obs`.
- Check that the testing functions catch and report these errors properly.

The test file is called `test_errors.R` under the test folder.
I will add more tests as I continue to check the code. Thanks!
Ok, sounds great!
Shall we leave this open or close it?
I guess it can be closed for now!
This is related to the code in #53.
It looks like that check is going to return the same error if the user passes an appropriate data object but there is an error for some other reason, won't it?
That seems hard to distinguish on our end. Also, it seems possible that the user could pass a function that is usually ok, but that for certain bootstrap draws it returns an error.
So, I think what we need to do is handle the errors every time we call these functions. If the function returns an error or warning just report to the user what happened. Then skip that bootstrap run and try a new one (a new draw of the data). Remember to put in a max number of redraws so that we don't get stuck in an infinite loop.