conroylau / lpinfer

lpinfer: An R Package for Inference in Linear Programs
GNU General Public License v3.0

Subsample requires `beta.obs` to return a variance matrix #92

Closed. a-torgovitsky closed this issue 4 years ago

a-torgovitsky commented 4 years ago
library("lpinfer")
# Load the mixed logit DGP helpers from the installed package's extdata folder
source(system.file("extdata", "dgp_mixedlogit.R", package = "lpinfer"))

dgp <- mixedlogit_dgp()
set.seed(1)
data <- mixedlogit_draw(dgp, n = 2000)
# beta.obs is passed as a function of the data and supplies no variance matrix
lpm <- lpmodel(A.obs = mixedlogit_Aobs(dgp),
               beta.obs = function(d) mixedlogit_betaobs(d, dgp),
               A.shp = rep(1, nrow(dgp$vdist)),
               beta.shp = 1,
               A.tgt = mixedlogit_Atgt_dfelast(dgp, w2eval = 1, eeval = -1))

subsample(data = data, lpmodel = lpm, .5)

yields

Error: The output of 'beta.obs' in 'lpmodel' needs to be a list of two objects (one vector and one matrix).

Was there some reason for this? If the user does not pass a variance matrix, we should just bootstrap it for them. Note that this would be the standard nonparametric bootstrap (redraw same number of observations as in data, with replacement) that we use in the other procedures. It would not depend on the subsampling size or replacement options in subsample.
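For readers following along, the standard nonparametric bootstrap described above can be sketched in base R as below. This is only an illustration, not lpinfer code: `boot_variance`, its `reps` argument, and the toy data are hypothetical names introduced here, and the estimator is assumed to map a data frame to a numeric vector.

```r
# Hypothetical helper (not part of lpinfer): bootstrap the variance matrix of
# a vector-valued estimator by redrawing nrow(data) rows with replacement.
boot_variance <- function(data, estimator, reps = 200) {
  n <- nrow(data)
  draws <- replicate(reps, {
    idx <- sample.int(n, size = n, replace = TRUE)
    estimator(data[idx, , drop = FALSE])
  })
  # replicate() gives a (length of estimate) x reps matrix, or a plain vector
  # for a scalar estimator, so coerce to a matrix with one column per draw.
  draws <- matrix(draws, ncol = reps)
  # Sample covariance across bootstrap draws (rows = draws after transposing).
  stats::var(t(draws))
}

# Toy example: the estimator is the vector of column means.
set.seed(1)
toy <- data.frame(x = rnorm(500), y = rnorm(500, sd = 2))
V <- boot_variance(toy, function(d) colMeans(d))
```

Note that the resampling size is always `nrow(data)` with replacement, so the result does not depend on any subsampling size or replacement setting.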

conroylau commented 4 years ago

I see. I will add the part to compute the bootstrap variance matrix in case it is not provided by the user. Thanks!

conroylau commented 4 years ago

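In the design of the subsample function, the variance matrix is re-estimated in each subsample. If the user does not provide a variance matrix, should I still do the same? That is, would it be correct to do the following in each subsampling iteration?

  • Obtain the subsample data data.bs based on the subsampling size and replacement option.
  • Estimate the variance matrix by a standard nonparametric bootstrap, redrawing the same number of observations as in data.bs with replacement.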

Thanks!

a-torgovitsky commented 4 years ago

No, I mean just estimate the variance matrix once at the beginning, using the nonparametric bootstrap on the data that was passed. That single estimate can then be used for all subsequent subsamples.

On Mon, Sep 14, 2020, 8:46 PM conroylau wrote:

In the design of the subsample function, the variance matrix is re-estimated in each subsample. If the variance matrix is not provided by the user, should I still do the same? That is, may I know am I correct if I do the followings in each subsampling procedure?

  • Obtain the subsample data data.bs based on the subsampling size and replacement option.
  • Estimate the variance matrix by a standard nonparametric bootstrap by redrawing the same number of observations in data.bs with replacement.

Thanks!


a-torgovitsky commented 4 years ago

Just to be clear, we should not do this, regardless of what the user passes:

In the design of the subsample function, the variance matrix is re-estimated in each subsample.

No matter what the user passes, we should just use a single variance matrix and stick with that throughout the resampling.
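A rough sketch of that control flow in base R, for illustration only: it is not lpinfer's internal code. The function `subsample_stats`, its arguments `m`, `reps`, and `boot_reps`, and the toy quadratic-form statistic are all hypothetical, and subsamples are assumed here to be drawn without replacement.

```r
# Hypothetical sketch, not lpinfer's internals: the variance matrix is
# computed once from the full data and reused for every subsample.
subsample_stats <- function(data, estimator, stat, m,
                            reps = 100, boot_reps = 200) {
  n <- nrow(data)
  # Step 1: one nonparametric bootstrap on the full data
  # (n observations redrawn with replacement in each bootstrap draw).
  boot <- replicate(boot_reps, {
    estimator(data[sample.int(n, n, replace = TRUE), , drop = FALSE])
  })
  V <- stats::var(t(matrix(boot, ncol = boot_reps)))
  # Step 2: every subsample of size m is evaluated with the same fixed V.
  draws <- vapply(seq_len(reps), function(i) {
    sub <- data[sample.int(n, m, replace = FALSE), , drop = FALSE]
    stat(estimator(sub), V)
  }, numeric(1))
  list(variance = V, draws = draws)
}

set.seed(1)
toy <- data.frame(x = rnorm(400), y = rnorm(400))
# Toy statistic: quadratic form in the estimate, weighted by the fixed V.
wald <- function(b, V) as.numeric(t(b) %*% solve(V) %*% b)
out <- subsample_stats(toy, function(d) colMeans(d), wald, m = 200)
```

The point of the structure is that `V` is computed before the subsampling loop and never inside it, so it is identical across all draws.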

conroylau commented 4 years ago

Done! I have updated the subsample procedure so that:

Thanks!

a-torgovitsky commented 4 years ago

Looks good, thanks