conroylau / lpinfer

lpinfer: An R Package for Inference in Linear Programs
GNU General Public License v3.0
3 stars 5 forks source link

Suggestions: Studentization #108

Closed nadavkun closed 3 years ago

nadavkun commented 3 years ago

Hi,

While wokring with the FSST command, I found that taking the square root of large matrices (over 1000x1000) can be a bottle neck. I found that using sqrtm function from the package pracma works a bit faster than the schur decomposition currently used. Also, when calculating the confidence intervals, using the invertci, with a list of given bootstraps estimates, it might be more efficient to just calculate the studentization once and use it again (in the in which d>=p).

conroylau commented 3 years ago

Thanks for the comments Nadav and also several suggestions from @a-torgovitsky!

The invertci procedure now reuses the studentization matrix obtained from the FSST procedure if it is provided and if d ≥ p.

For the matrix square root, users can pass the function they want to use via the sqrtm.method option. There are several requirements for the sqrtm.method option:

The default is to use the pracma::sqrtm function. Since we also allow users to pass other methods to sqrtm.method, we check if the matrix square root is correct by computing the Frobenius norm. If it is larger than the tolerance level (specified in the optional sqrtm.tol argument), then the matrix square root will be computed by the expm::sqrtm method, where a closed-form solution exists. The default tolerance level is chosen to be the same as that in the pracma::sqrtm function.

An example of using the sqrtm.method option is as follows:

rm(list = ls())
library(lpinfer)
source("./inst/example/dgp_mixedlogit.R")

set.seed(1)
dgp <- mixedlogit_dgp(dimw = 4, dimv = 16)
df <- mixedlogit_draw(dgp, n = 4000)
lpm <- lpmodel(A.obs = mixedlogit_Aobs(dgp),
               beta.obs = function(d) mixedlogit_betaobs(d, dgp),
               A.shp = rep(1, nrow(dgp$vdist)),
               beta.shp = 1,
               A.tgt = mixedlogit_Atgt_dfelast(dgp, w2eval = 1, eeval = -1))

set.seed(1)
fsst(data = df,
     lpmodel = lpm,
     beta.tgt = .25,
     R = 500,
     sqrtm.method = function(m) {
       pracma::sqrtm(m, kmax = 30)$B
     })

I will also update the relevant parts in the vignette shortly.