Output and print - Githubissues

a-torgovitsky commented 4 years ago

It should be analogous to the way lm works.

For example, in RStudio,

lr <- lm(data = sampledata, Y ~ D)

This doesn't print anything

While this

lm(data = sampledata, Y ~ D)

gives the same output as

lr <- lm(data = sampledata, Y ~ D)
print(lr)

Our functions should work the same way
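
For reference, a minimal sketch of the mechanism behind the lm behavior: the function returns a classed object and prints nothing itself, and R auto-prints a visible top-level value by dispatching to the class's print method. The names toy_test and print.toy_test are hypothetical and are not part of this package.

```r
# Hypothetical constructor: returns a classed list and prints nothing itself.
toy_test <- function(pval) {
  structure(list(pval = pval), class = "toy_test")
}

# S3 print method: used by both auto-printing and explicit print().
print.toy_test <- function(x, ...) {
  cat(sprintf("p-value: %.5f\n", x$pval))
  invisible(x)  # return the object invisibly, as print methods conventionally do
}

r <- toy_test(0.26)  # assignment: nothing is printed
r                    # top-level evaluation: auto-prints via print.toy_test
print(r)             # same output as the line above
```

With this layout, any progress messages can be kept separate from the returned object, so assigning the result stays silent.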

a-torgovitsky commented 4 years ago

Oops, I see this is partly because I had `progress = TRUE` on.

For dkqs_cone, `progress = TRUE` should just tell us how many bootstraps have been completed and how many are remaining.
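
Something along these lines is what I have in mind. This is only a sketch: progress_msg and run_bootstrap are hypothetical names, and the statistic computed inside the loop is a placeholder rather than the actual bootstrap step.

```r
# Hypothetical helper: formats the "completed / remaining" message.
progress_msg <- function(done, total) {
  sprintf("Bootstrap replications completed: %d, remaining: %d",
          done, total - done)
}

# Hypothetical bootstrap loop that reports progress on a single console line.
run_bootstrap <- function(bs_num, progress = TRUE) {
  stats <- numeric(bs_num)
  for (b in seq_len(bs_num)) {
    # Placeholder statistic; the real procedure would go here.
    stats[b] <- mean(sample(c(0, 1), 20, replace = TRUE))
    # "\r" returns to the start of the line so the message is overwritten.
    if (progress) cat("\r", progress_msg(b, bs_num), sep = "")
  }
  if (progress) cat("\n")
  invisible(stats)
}

res <- run_bootstrap(50, progress = TRUE)
```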

a-torgovitsky commented 4 years ago

The progress indicator works for parallel_example but not for a single example with progress = TRUE. For example:

library("linearprog")

func_full_info <- function(df){
  beta = NULL
  y_list = sort(unique(df[,"Y"]))
  n = dim(df)[1]
  yn = length(y_list)
  for (i in 1:yn){
    beta_i = sum((df[,"Y"] == y_list[i]) * (df[,"D"] == 1))/n
    beta = c(beta,c(beta_i))
  }
  beta = as.matrix(beta)
  return(beta)
}

func_two_moment <- function(df){
  beta = matrix(c(0,0), nrow = 2)
  n = dim(df)[1]
  beta[1] = sum(df[,"Y"] * df[,"D"])/n
  beta[2] = sum(df[,"D"])/n
  return(beta)
}

N = dim(sampledata)[1]
J1 = length(unique(sampledata[,"Y"]))
yp = seq(0,1,1/(J1-1))

A_obs_twom = matrix(c(rep(0,J1), yp, rep(0,J1), rep(1, J1)), nrow = 2,
                byrow = TRUE)
A_target = matrix(c(yp, yp), nrow = 1)
tau = sqrt(log(N)/N)

r <- dkqs(df = sampledata,
          A_obs = A_obs_twom,
          A_tgt = A_target,
          func_obs = func_two_moment,
          beta_tgt = 0.375,
          bs_seed = 1,
          bs_num = 10000,
          p_sig = 2,
          tau_input = tau,
          solver = "gurobi",
          cores = 8,
          progress = TRUE)

Not sure what the explanation is...

EDIT: Oh, I see, you said in the email that it is not implemented for the parallel case yet. Makes sense!

a-torgovitsky commented 4 years ago

This output doesn't look so good:

library("linearprog")

func_full_info <- function(df){
  beta = NULL
  y_list = sort(unique(df[,"Y"]))
  n = dim(df)[1]
  yn = length(y_list)
  for (i in 1:yn){
    beta_i = sum((df[,"Y"] == y_list[i]) * (df[,"D"] == 1))/n
    beta = c(beta,c(beta_i))
  }
  beta = as.matrix(beta)
  return(beta)
}

func_two_moment <- function(df){
  beta = matrix(c(0,0), nrow = 2)
  n = dim(df)[1]
  beta[1] = sum(df[,"Y"] * df[,"D"])/n
  beta[2] = sum(df[,"D"])/n
  return(beta)
}

N = dim(sampledata)[1]
J1 = length(unique(sampledata[,"Y"]))
yp = seq(0,1,1/(J1-1))

A_obs_twom = matrix(c(rep(0,J1), yp, rep(0,J1), rep(1, J1)), nrow = 2,
                byrow = TRUE)
A_target = matrix(c(yp, yp), nrow = 1)
tau = sqrt(log(N)/N)

dkqs_farg <- list(df = sampledata,
                  A_obs = A_obs_twom,
                  A_tgt = A_target,
                  func_obs = func_two_moment,
                  bs_seed = 1,
                  bs_num = 100,
                  p_sig = 2,
                  tau_input = tau,
                  solver = "gurobi",
                  cores = 1,
                  progress = FALSE)

invertci_dkqs <- invertci(f = dkqs,
                          farg = dkqs_farg,
                          alpha = 0.05,
                          lb0 = 0,
                          lb1 = 0.4,
                          ub0 = 1,
                          ub1 = 0.6,
                          tol = 0.001,
                          df_ci = NULL,
                          progress = TRUE)

Gives this:

< Constructing confidence interval for alpha = 0.05 >

=== Computing upper bound of confidence interval ===
Bootstrap completed!
Iteration    Test point      Lower bound     Upper bound     p-value     Decision
Left end-pt.     0.60000     0.60000     NA      0.78000     Do not reject  
Bootstrap completed!
Right end-pt.    1.00000     NA      1.00000     0.00000     Reject     
Bootstrap completed!
1        0.80000     0.60000     1.00000     0.00000     Reject     
Bootstrap completed!
2        0.70000     0.60000     0.80000     0.00000     Reject     
Bootstrap completed!
3        0.65000     0.60000     0.70000     0.05000     Do not reject  
Bootstrap completed!
4        0.67500     0.65000     0.70000     0.00000     Reject     
Bootstrap completed!
5        0.66250     0.65000     0.67500     0.00000     Reject     
Bootstrap completed!
6        0.65625     0.65000     0.66250     0.02000     Reject     
Bootstrap completed!
7        0.65312     0.65000     0.65625     0.03000     Do not reject  
Bootstrap completed!
8        0.65469     0.65312     0.65625     0.02000     Reject     
Bootstrap completed!
9        0.65391     0.65312     0.65469     0.02000     Reject     
Bootstrap completed!
>>> Length of interval is below tolerance level. Bisection method is completed.

=== Computing lower bound of confidence interval ===
Bootstrap completed!
Iteration    Test point      Lower bound     Upper bound     p-value     Decision
Left end-pt.     0.00000     0.00000     NA      0.00000     Reject     
Bootstrap completed!
Right end-pt.    0.40000     NA      0.40000     0.74000     Do not reject  
Bootstrap completed!
1        0.20000     0.00000     0.40000     0.00000     Reject     
Bootstrap completed!
2        0.30000     0.20000     0.40000     0.00000     Reject     
Bootstrap completed!
3        0.35000     0.30000     0.40000     0.00000     Reject     
Bootstrap completed!
4        0.37500     0.35000     0.40000     0.26000     Do not reject  
Bootstrap completed!
5        0.36250     0.35000     0.37500     0.03000     Do not reject  
Bootstrap completed!
6        0.35625     0.35000     0.36250     0.00000     Reject     
Bootstrap completed!
7        0.35938     0.35625     0.36250     0.01000     Reject     
Bootstrap completed!
8        0.36094     0.35938     0.36250     0.01000     Reject     
Bootstrap completed!
9        0.36172     0.36094     0.36250     0.01000     Reject     
Bootstrap completed!
>>> Length of interval is below tolerance level. Bisection method is completed.
-----------------------------------
< Significance level: 0.05 >
Tolerance level: 0.001.
Total number of iterations: 20.
Confidence interval: [0.362, 0.654].

It's nice to have the "Progress x%" indicator while it is running, but we don't need "Bootstrap completed!" to stay there afterwards.

conroylau commented 4 years ago

Done for the non-parallel case! It now shows "Progress x%" while running, and no completion message stays on the console afterwards.

Will continue to work on the parallel case. Thanks!

conroylau commented 4 years ago

Done: updated the code for both the parallel and non-parallel cases.

The following code for invertci:

func_full_info <- function(df){
    beta = NULL
    y_list = sort(unique(df[,"Y"]))
    n = dim(df)[1]
    yn = length(y_list)
    for (i in 1:yn){
        beta_i = sum((df[,"Y"] == y_list[i]) * (df[,"D"] == 1))/n
        beta = c(beta,c(beta_i))
    }
    beta = as.matrix(beta)
    return(beta)
}

func_two_moment <- function(df){
    beta = matrix(c(0,0), nrow = 2)
    n = dim(df)[1]
    beta[1] = sum(df[,"Y"] * df[,"D"])/n
    beta[2] = sum(df[,"D"])/n
    return(beta)
}

N = dim(sampledata)[1]
J1 = length(unique(sampledata[,"Y"]))
yp = seq(0,1,1/(J1-1))

A_obs_twom = matrix(c(rep(0,J1), yp, rep(0,J1), rep(1, J1)), nrow = 2,
                    byrow = TRUE)
A_target = matrix(c(yp, yp), nrow = 1)
tau = sqrt(log(N)/N)

dkqs_farg <- list(df = sampledata,
                  A_obs = A_obs_twom,
                  A_tgt = A_target,
                  func_obs = func_two_moment,
                  bs_seed = 1,
                  bs_num = 100,
                  p_sig = 2,
                  tau_input = tau,
                  solver = "gurobi",
                  cores = 1,
                  progress = FALSE)

invertci_dkqs <- invertci(f = dkqs,
                          farg = dkqs_farg,
                          alpha = 0.05,
                          lb0 = 0,
                          lb1 = 0.4,
                          ub0 = 1,
                          ub1 = 0.6,
                          tol = 0.001,
                          df_ci = NULL,
                          progress = TRUE)

gives the following output:

< Constructing confidence interval for alpha = 0.05 >

 === Computing upper bound of confidence interval ===
 Iteration   Lower bound     Upper bound     Test point      p-value     Reject?
 Left end pt.    0.60000     NA      0.60000     0.73000     FALSE  
 Right end pt.   NA      1.00000     1.00000     0.00000     TRUE   
 1       0.60000     1.00000     0.80000     0.00000     TRUE   
 2       0.60000     0.80000     0.70000     0.00000     TRUE   
 3       0.60000     0.70000     0.65000     0.05000     FALSE  
 4       0.65000     0.70000     0.67500     0.00000     TRUE   
 5       0.65000     0.67500     0.66250     0.00000     TRUE   
 6       0.65000     0.66250     0.65625     0.02000     TRUE   
 7       0.65000     0.65625     0.65312     0.03000     FALSE  
 8       0.65312     0.65625     0.65469     0.02000     TRUE   
 9       0.65312     0.65469     0.65391     0.02000     TRUE   
 >>> Length of interval is below tolerance level. Bisection method is completed.

 === Computing lower bound of confidence interval ===
 Iteration   Lower bound     Upper bound     Test point      p-value     Reject?
 Left end pt.    0.00000     NA      0.00000     0.00000     TRUE   
 Right end pt.   NA      0.40000     0.40000     0.79000     FALSE  
 1       0.00000     0.40000     0.20000     0.00000     TRUE   
 2       0.20000     0.40000     0.30000     0.00000     TRUE   
 3       0.30000     0.40000     0.35000     0.00000     TRUE   
 4       0.35000     0.40000     0.37500     0.26000     FALSE  
 5       0.35000     0.37500     0.36250     0.03000     FALSE  
 6       0.35000     0.36250     0.35625     0.00000     TRUE   
 7       0.35625     0.36250     0.35938     0.01000     TRUE   
 8       0.35938     0.36250     0.36094     0.01000     TRUE   
 9       0.36094     0.36250     0.36172     0.01000     TRUE   
 >>> Length of interval is below tolerance level. Bisection method is completed.

Applying the print command on invertci_dkqs, i.e.

print(invertci_dkqs)

gives

 < Significance level: 0.05 >
 Total number of iterations: 20.
 Confidence interval: [0.36211, 0.65352].
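
The iteration log above is consistent with plain bisection on each end of the interval: start from one point that is not rejected and one that is, test the midpoint, and shrink the bracket until its length falls below tol. The reported bounds also match the midpoint of the final bracket. Below is a minimal sketch under those assumptions. The names toy_pval and bisect_bound are hypothetical, the rejection rule p < alpha is an assumption, and this is not the package's actual code.

```r
# Hypothetical p-value function: large inside [0.36, 0.65], zero outside.
toy_pval <- function(beta) if (beta >= 0.36 && beta <= 0.65) 0.5 else 0

# Bisection between a point that is not rejected (`inside`) and a point
# that is rejected (`outside`). Assumes rejection when p-value < alpha.
bisect_bound <- function(pval_fun, inside, outside, alpha = 0.05, tol = 0.001) {
  iter <- 0
  while (abs(outside - inside) >= tol) {
    mid <- (inside + outside) / 2
    if (pval_fun(mid) < alpha) {
      outside <- mid  # midpoint rejected: move the rejected end inward
    } else {
      inside <- mid   # midpoint not rejected: move the accepted end outward
    }
    iter <- iter + 1
  }
  # Report the midpoint of the final bracket as the bound.
  list(bound = (inside + outside) / 2, iterations = iter)
}

ub <- bisect_bound(toy_pval, inside = 0.6, outside = 1)  # upper bound
lb <- bisect_bound(toy_pval, inside = 0.4, outside = 0)  # lower bound
cat(sprintf("Confidence interval: [%.5f, %.5f].\n", lb$bound, ub$bound))
```

With an initial bracket of length 0.4 and tol = 0.001, nine halvings are needed on each side, which matches the nine numbered iterations in the log.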

The following is a sample of the console while the bootstrap procedure is running, with progress in the list dkqs_farg set to TRUE:

 < Constructing confidence interval for alpha = 0.05 >

 === Computing upper bound of confidence interval ===
 Iteration   Lower bound     Upper bound     Test point      p-value     Reject?
 Left end pt.    0.60000     NA      0.60000     0.73000     FALSE  
 Right end pt.   NA      1.00000     1.00000     0.00000     TRUE   
 1       0.60000     1.00000     0.80000     0.00000     TRUE   
  |=======             |  35%
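
A bar of this shape can be produced with utils::txtProgressBar (style 3), and it can be wiped once the loop finishes so that nothing stays on the console. The sketch below is illustrative only: run_with_bar is a hypothetical name, the statistic is a placeholder, and it does not reproduce how the package itself draws the bar.

```r
# Hypothetical bootstrap loop with a transient text progress bar.
run_with_bar <- function(bs_num, progress = TRUE) {
  stats <- numeric(bs_num)
  if (progress) {
    pb <- utils::txtProgressBar(min = 0, max = bs_num, style = 3, width = 20)
  }
  for (b in seq_len(bs_num)) {
    # Placeholder statistic; the real bootstrap replication would go here.
    stats[b] <- mean(sample(c(0, 1), 20, replace = TRUE))
    if (progress) utils::setTxtProgressBar(pb, b)
  }
  if (progress) {
    # Overwrite the bar with blanks instead of calling close(pb), because
    # close() would emit a newline and leave the finished bar on screen.
    cat("\r", strrep(" ", 40), "\r", sep = "")
  }
  invisible(stats)
}

res <- run_with_bar(100, progress = TRUE)
```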

Thank you!

conroylau / lpinfer

Output and print #2