stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
2.57k stars 368 forks source link

NaN error message for <lower=0> parameter #1308

Closed bob-carpenter closed 9 years ago

bob-carpenter commented 9 years ago

The following program and data produce errors like this and can't actually sample.

The final message is:

 No acceptably small step size could be found. Perhaps the posterior is not continuous?
error occurred during calling the sampler; sampling not done
Stan model 'heckman_hete' does not contain samples.

The interemdiate messages are

SAMPLING FOR MODEL 'heckman_hete' NOW (CHAIN 1).
sigma_eq=0.894267sigma_beta=[0.797875,0.359337,0.184746,2.56627,2.46375,0.196795,0.472598,0.611689,5.98179,1.11878,0.689196,0.300781,1.58627]sigma_eq=nanInformational Message: The current Metropolis proposal is about to be rejected because of the following issue:stan::prob::cauchy_log(N4stan5agrad3varE): Random variable is nan:0, but must not be nan!If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.sigma_eq=0.894267sigma_beta=[0.797875,0.359337,0.184746,2.56627,2.46375,0.196795,0.472598,0.611689,5.98179,1.11878,0.689196,0.300781,1.58627]sigma_eq=nanInformational Message: The current Metropolis proposal is about to be rejected because of the following issue:stan::prob::cauchy_log(N4stan5agrad3varE): Random variable is nan:0, but must not be nan!If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.sigma_eq=0.894267sigma_beta=[0.797875,0.359337,0.184746,2.56627,2.46375,0.196795,0.472598,0.611689,5.98179,1.11878,0.689196,0.300781,1.58627]sigma_eq=nan...
data {
  int<lower=0> N;           // number of customers
  int<lower=0> M;           // total number of observations
  int<lower=0> nB1;         // number of beta in equation 1
  int<lower=0> nB2;          // number of beta in equation 2
  int<lower=0> nB;
  int<lower=0> nZ;           // number of individual demographic variables plus intercept
  int<lower=0> ID[M];        // index for the observations
  matrix[N,nZ] Z;
  matrix[M,nB1] X1;
  matrix[M,nB2] X2;
  vector[M] Y1;
  vector[M] Y2;
}
parameters {

  matrix[nB,nZ] mu;                           // prior matrix for beta
  matrix[nB,N] error_beta;                    // a trick on the variance of beta

  // cholesky_factor_corr[nB] L_Omega_beta;   // the correlation matrix of the two equations
  vector<lower=0>[nB] sigma_beta;             // the s.t.d of the second equation 

  real<lower=-1,upper=1> rho_eq;              // the correlation matrix of the two equations
  real<lower=0> sigma_eq;                     // the s.t.d of the second equation 

}
transformed parameters {

  matrix[N,nB] beta;
  // beta <- Z*mu' + (diag_pre_multiply(sigma_beta,L_Omega_beta)*error_beta)';
  beta <- Z*mu' + (diag_matrix(sigma_beta)*error_beta)';

}
model{ 

  matrix[N,nB1] beta1;
  matrix[N,nB2] beta2;

  vector[M] lp;
  vector[M] value;

print("sigma_eq=",sigma_eq);
  sigma_eq ~ cauchy(0,2);                      // constrain it to be positive   

  //L_Omega_beta ~ lkj_corr_cholesky(2);       // correlation matrix of beta vector
print("sigma_beta=",sigma_beta);
  sigma_beta ~ cauchy(0,2);                    // s.t.d. of the covariance matrix (here cause problems)

  to_vector(mu) ~ normal(0,3); 
  to_vector(error_beta) ~ normal(0,1);

  // transform beta for two equations for each customer
  beta1 <- block(beta,1,1,N,nB1);
  beta2 <- block(beta,1,nB1+1,N,nB2);

  // calculate the likelihood
  for( m in 1:M){    
    if( Y1[m] == 0 ){
      lp[m] <- normal_cdf_log(-X1[m]*beta1[ID[m]]',0,1);
    }
    else{
      value[m] <- (X1[m]*beta1[ID[m]]'+rho_eq/sigma_eq*(Y2[m]-X2[m]*beta2[ID[m]]'))/(1-rho_eq^2)^0.5;
      lp[m] <- normal_cdf_log(value[m],0,1) - log(sigma_eq) - (Y2[m]-X2[m]*beta2[ID[m]]')^2/2/sigma_eq^2;
    }
  }

  increment_log_prob( log_sum_exp( lp ) );
}
generated quantities {

  vector[nB] m_beta;
  vector[nB] sd_beta; 

  // heterogeneity 
  for( i in 1:nB ){
    m_beta[i] <- mean( col(beta,i) );
    sd_beta[i] <- sd( col(beta,i) );
  }

}

And here's some R code to simulate and test

###################################################################################
# This code will try to run the homogeneous heckman model using a small
# sample of the customers. The mcmc will be conducted using stan.
# I will first use a simulated data based on the following parameterization 
# 
# Stan doesn't have the cdf of multivariate normal, so writing the likelihood 
# will not work 
#################################################################################

# version 2.6.0 for stan
# version 3.1.2 for R
require(rstan)  

library(MASS)

#----------------------------------------------------------------------------
#                                     simulate data
#----------------------------------------------------------------------------

# set-up for observations
N = 100;
nT = sample(10:30,N,replace=T);
M = sum(nT);
ID = rep(1:N,nT)     # ID must go from 1 to M 

# numbers of covariates for the two equations
nB1 = 5;
nB2 = 8;
nB = nB1 + nB2;

# covariance of the two equations
rho = 0.2;
sigma = 2;
Cov = matrix(c(1,rho*sigma,rho*sigma,sigma^2),2,2);

# demographics for hete
zC = matrix(c(2,0.6,0.6,2),2,2);
Z = as.matrix(cbind(1,mvrnorm(N,c(0,0),zC)));
nZ = ncol(Z);

#  prior for beta
mu = matrix(runif(nB*nZ,-1,1),nB,);
Lambda = diag(nB);

# covariates
X1 = cbind(1, matrix(rnorm(M*(nB1-1)),,nB1-1)); 
X2 = cbind(1, matrix(rnorm(M*(nB2-1)),,nB2-1)); 

# simulate actions and preferences
visit = array(0,M);
spend = array(0,M);
beta = matrix(0,N,nB);
for( i in 1:N ){
  beta[i,] = mvrnorm(1, mu%*%Z[i,],Lambda);
  beta1 = beta[1:nB1];
  beta2 = beta[(nB1+1):nB];
  for( t in 1:nT[i] ){
    if( i == 1){
      ind = t;
    }else{
      ind = sum(nT[1:(i-1)]) + t; 
    }   
    error = mvrnorm(1,c(0,0),Cov);
    u_it = X1[ind,]%*%beta1 + error[1];
    if( u_it >0 ){
      visit[ind] = 1;
      spend[ind] = X2[ind,]%*%beta2 + error[2] + 
        Cov[1,2]*dnorm(X1[ind,]%*%beta1,0,1)/pnorm(X1[ind,]%*%beta1,0,1);
    }    
  }  
}

#----------------------------------------------------------------------------------
#                                   define the model
#----------------------------------------------------------------------------------

#----------------------------------------------------------------------------------
#                                   run the model
#----------------------------------------------------------------------------------

#------------------
# initialization
#------------------         

data_list = list(N=N, M=M, nB1=nB1, nB2=nB2, nB=nB, nZ=nZ, ID=ID,
                 Z=Z, X1=X1, X2=X2, Y1=visit, Y2=spend);

#-------------------
#  mcmc
#-------------------

fit = stan("heckman_hete.stan",data=data_list,iter=10,chains=1)

#-------------------
#  analysis
#-------------------
print(fit, pars = c("m_beta", "sigma_eq","rho_eq"))
syclik commented 9 years ago

Bob, I'll try to hunt this down asap. It looks like a serious problem.

On Monday, February 23, 2015, Bob Carpenter notifications@github.com wrote:

The following program and data produce errors like this and can't actually sample.

The final message is:

No acceptably small step size could be found. Perhaps the posterior is not continuous? error occurred during calling the sampler; sampling not done Stan model 'heckman_hete' does not contain samples.

The interemdiate messages are

sigma_eq=0.894267sigma_beta=[0.797875,0.359337,0.184746,2.56627,2.46375,0.196795,0.472598,0.611689,5.98179,1.11878,0.689196,0.300781,1.58627]sigma_eq=nanInformational Message: The current Metropolis proposal is about to be rejected because of the following issue:stan::prob::cauchy_log(N4stan5agrad3varE): Random variable is nan:0, but must not be nan!If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,but if this warning occurs often then your model may be either severely ill-conditioned or

Note that the message needs to be smaller, because RStan (at least through the GUI) truncates. Also, there are issues with everything being run together and not having newlines.

data { int N; // number of customers int M; // total number of observations int nB1; // number of beta in equation 1 int nB2; // number of beta in equation 2 int nB; int nZ; // number of individual demographic variables plus intercept int ID[M]; // index for the observations matrix[N,nZ] Z; matrix[M,nB1] X1; matrix[M,nB2] X2; vector[M] Y1; vector[M] Y2; } parameters {

matrix[nB,nZ] mu; // prior matrix for beta matrix[nB,N] error_beta; // a trick on the variance of beta

// cholesky_factor_corr[nB] L_Omega_beta; // the correlation matrix of the two equations vector[nB] sigma_beta; // the s.t.d of the second equation

real rho_eq; // the correlation matrix of the two equations real sigma_eq; // the s.t.d of the second equation

} transformed parameters {

matrix[N,nB] beta; // beta <- Z_mu' + (diag_pre_multiply(sigma_beta,L_Omega_beta)_error_beta)'; beta <- Z_mu' + (diag_matrix(sigma_beta)_error_beta)';

} model{

matrix[N,nB1] beta1; matrix[N,nB2] beta2;

vector[M] lp; vector[M] value;

print("sigma_eq=",sigma_eq); sigma_eq ~ cauchy(0,2); // constrain it to be positive

//L_Omega_beta ~ lkj_corr_cholesky(2); // correlation matrix of beta vector print("sigma_beta=",sigma_beta); sigma_beta ~ cauchy(0,2); // s.t.d. of the covariance matrix (here cause problems)

to_vector(mu) ~ normal(0,3); to_vector(error_beta) ~ normal(0,1);

// transform beta for two equations for each customer beta1 <- block(beta,1,1,N,nB1); beta2 <- block(beta,1,nB1+1,N,nB2);

// calculate the likelihood for( m in 1:M){ if( Y1[m] == 0 ){ lp[m] <- normal_cdf_log(-X1[m]_beta1[ID[m]]',0,1); } else{ value[m] <- (X1[m]_beta1[ID[m]]'+rho_eq/sigmaeq(Y2[m]-X2[m]_beta2[ID[m]]'))/(1-rho_eq^2)^0.5; lp[m] <- normal_cdf_log(value[m],0,1) - log(sigma_eq) - (Y2[m]-X2[m]*beta2[ID[m]]')^2/2/sigma_eq^2; } }

increment_log_prob( log_sum_exp( lp ) ); } generated quantities {

vector[nB] m_beta; vector[nB] sd_beta;

// heterogeneity for( i in 1:nB ){ m_beta[i] <- mean( col(beta,i) ); sd_beta[i] <- sd( col(beta,i) ); }

}

And here's some R code to simulate and test

###################################################################################

This code will try to run the homogeneous heckman model using a small

sample of the customers. The mcmc will be conducted using stan.

I will first use a simulated data based on the following parameterization

#

Stan doesn't have the cdf of multivariate normal, so writing the likelihood

will not work

#################################################################################

version 2.6.0 for stan

version 3.1.2 for R

require(rstan)

library(MASS)

----------------------------------------------------------------------------

simulate data

----------------------------------------------------------------------------

set-up for observations

N = 100; nT = sample(10:30,N,replace=T); M = sum(nT); ID = rep(1:N,nT) # ID must go from 1 to M

numbers of covariates for the two equations

nB1 = 5; nB2 = 8; nB = nB1 + nB2;

covariance of the two equations

rho = 0.2; sigma = 2; Cov = matrix(c(1,rho_sigma,rho_sigma,sigma^2),2,2);

demographics for hete

zC = matrix(c(2,0.6,0.6,2),2,2); Z = as.matrix(cbind(1,mvrnorm(N,c(0,0),zC))); nZ = ncol(Z);

prior for beta

mu = matrix(runif(nB*nZ,-1,1),nB,); Lambda = diag(nB);

covariates

X1 = cbind(1, matrix(rnorm(M(nB1-1)),,nB1-1)); X2 = cbind(1, matrix(rnorm(M(nB2-1)),,nB2-1));

simulate actions and preferences

visit = array(0,M); spend = array(0,M); beta = matrix(0,N,nB); for( i in 1:N ){ beta[i,] = mvrnorm(1, mu%_%Z[i,],Lambda); beta1 = beta[1:nB1]; beta2 = beta[(nB1+1):nB]; for( t in 1:nT[i] ){ if( i == 1){ ind = t; }else{ ind = sum(nT[1:(i-1)]) + t; } error = mvrnorm(1,c(0,0),Cov); uit = X1[ind,]%%beta1 + error[1]; if( uit >0 ){ visit[ind] = 1; spend[ind] = X2[ind,]%%beta2 + error[2] + Cov[1,2]dnorm(X1[ind,]%%beta1,0,1)/pnorm(X1[ind,]%_%beta1,0,1); } } }

----------------------------------------------------------------------------------

define the model

----------------------------------------------------------------------------------

----------------------------------------------------------------------------------

run the model

----------------------------------------------------------------------------------

------------------

initialization

------------------

data_list = list(N=N, M=M, nB1=nB1, nB2=nB2, nB=nB, nZ=nZ, ID=ID, Z=Z, X1=X1, X2=X2, Y1=visit, Y2=spend);

-------------------

mcmc

-------------------

fit = stan("heckman_hete.stan",data=data_list,iter=10,chains=1)

-------------------

analysis

-------------------

print(fit, pars = c("m_beta", "sigma_eq","rho_eq"))

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1308.

bob-carpenter commented 9 years ago

The model doesn't have problems with init=0, but I do think we want to track down where this problem can be coming from.

On Feb 23, 2015, at 4:10 PM, Daniel Lee notifications@github.com wrote:

Bob, I'll try to hunt this down asap. It looks like a serious problem.

On Monday, February 23, 2015, Bob Carpenter notifications@github.com wrote:

The following program and data produce errors like this and can't actually sample.

The final message is:

No acceptably small step size could be found. Perhaps the posterior is not continuous? error occurred during calling the sampler; sampling not done Stan model 'heckman_hete' does not contain samples.

The interemdiate messages are

sigma_eq=0.894267sigma_beta=[0.797875,0.359337,0.184746,2.56627,2.46375,0.196795,0.472598,0.611689,5.98179,1.11878,0.689196,0.300781,1.58627]sigma_eq=nanInformational Message: The current Metropolis proposal is about to be rejected because of the following issue:stan::prob::cauchy_log(N4stan5agrad3varE): Random variable is nan:0, but must not be nan!If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,but if this warning occurs often then your model may be either severely ill-conditioned or

Note that the message needs to be smaller, because RStan (at least through the GUI) truncates. Also, there are issues with everything being run together and not having newlines.

data { int N; // number of customers int M; // total number of observations int nB1; // number of beta in equation 1 int nB2; // number of beta in equation 2 int nB; int nZ; // number of individual demographic variables plus intercept int ID[M]; // index for the observations matrix[N,nZ] Z; matrix[M,nB1] X1; matrix[M,nB2] X2; vector[M] Y1; vector[M] Y2; } parameters {

matrix[nB,nZ] mu; // prior matrix for beta matrix[nB,N] error_beta; // a trick on the variance of beta

// cholesky_factor_corr[nB] L_Omega_beta; // the correlation matrix of the two equations vector[nB] sigma_beta; // the s.t.d of the second equation

real rho_eq; // the correlation matrix of the two equations real sigma_eq; // the s.t.d of the second equation

} transformed parameters {

matrix[N,nB] beta; // beta <- Z_mu' + (diag_pre_multiply(sigma_beta,L_Omega_beta)_error_beta)'; beta <- Z_mu' + (diag_matrix(sigma_beta)_error_beta)';

} model{

matrix[N,nB1] beta1; matrix[N,nB2] beta2;

vector[M] lp; vector[M] value;

print("sigma_eq=",sigma_eq); sigma_eq ~ cauchy(0,2); // constrain it to be positive

//L_Omega_beta ~ lkj_corr_cholesky(2); // correlation matrix of beta vector print("sigma_beta=",sigma_beta); sigma_beta ~ cauchy(0,2); // s.t.d. of the covariance matrix (here cause problems)

to_vector(mu) ~ normal(0,3); to_vector(error_beta) ~ normal(0,1);

// transform beta for two equations for each customer beta1 <- block(beta,1,1,N,nB1); beta2 <- block(beta,1,nB1+1,N,nB2);

// calculate the likelihood for( m in 1:M){ if( Y1[m] == 0 ){ lp[m] <- normal_cdf_log(-X1[m]_beta1[ID[m]]',0,1); } else{ value[m] <- (X1[m]_beta1[ID[m]]'+rho_eq/sigmaeq(Y2[m]-X2[m]_beta2[ID[m]]'))/(1-rho_eq^2)^0.5; lp[m] <- normal_cdf_log(value[m],0,1) - log(sigma_eq) - (Y2[m]-X2[m]*beta2[ID[m]]')^2/2/sigma_eq^2; } }

increment_log_prob( log_sum_exp( lp ) ); } generated quantities {

vector[nB] m_beta; vector[nB] sd_beta;

// heterogeneity for( i in 1:nB ){ m_beta[i] <- mean( col(beta,i) ); sd_beta[i] <- sd( col(beta,i) ); }

}

And here's some R code to simulate and test

###################################################################################

This code will try to run the homogeneous heckman model using a small

sample of the customers. The mcmc will be conducted using stan.

I will first use a simulated data based on the following parameterization

#

Stan doesn't have the cdf of multivariate normal, so writing the likelihood

will not work

#################################################################################

version 2.6.0 for stan

version 3.1.2 for R

require(rstan)

library(MASS)

----------------------------------------------------------------------------

simulate data

----------------------------------------------------------------------------

set-up for observations

N = 100; nT = sample(10:30,N,replace=T); M = sum(nT); ID = rep(1:N,nT) # ID must go from 1 to M

numbers of covariates for the two equations

nB1 = 5; nB2 = 8; nB = nB1 + nB2;

covariance of the two equations

rho = 0.2; sigma = 2; Cov = matrix(c(1,rho_sigma,rho_sigma,sigma^2),2,2);

demographics for hete

zC = matrix(c(2,0.6,0.6,2),2,2); Z = as.matrix(cbind(1,mvrnorm(N,c(0,0),zC))); nZ = ncol(Z);

prior for beta

mu = matrix(runif(nB*nZ,-1,1),nB,); Lambda = diag(nB);

covariates

X1 = cbind(1, matrix(rnorm(M(nB1-1)),,nB1-1)); X2 = cbind(1, matrix(rnorm(M(nB2-1)),,nB2-1));

simulate actions and preferences

visit = array(0,M); spend = array(0,M); beta = matrix(0,N,nB); for( i in 1:N ){ beta[i,] = mvrnorm(1, mu%_%Z[i,],Lambda); beta1 = beta[1:nB1]; beta2 = beta[(nB1+1):nB]; for( t in 1:nT[i] ){ if( i == 1){ ind = t; }else{ ind = sum(nT[1:(i-1)]) + t; } error = mvrnorm(1,c(0,0),Cov); uit = X1[ind,]%%beta1 + error[1]; if( uit >0 ){ visit[ind] = 1; spend[ind] = X2[ind,]%%beta2 + error[2] + Cov[1,2]dnorm(X1[ind,]%%beta1,0,1)/pnorm(X1[ind,]%_%beta1,0,1); } } }

----------------------------------------------------------------------------------

define the model

----------------------------------------------------------------------------------

----------------------------------------------------------------------------------

run the model

----------------------------------------------------------------------------------

------------------

initialization

------------------

data_list = list(N=N, M=M, nB1=nB1, nB2=nB2, nB=nB, nZ=nZ, ID=ID, Z=Z, X1=X1, X2=X2, Y1=visit, Y2=spend);

-------------------

mcmc

-------------------

fit = stan("heckman_hete.stan",data=data_list,iter=10,chains=1)

-------------------

analysis

-------------------

print(fit, pars = c("m_beta", "sigma_eq","rho_eq"))

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1308.

— Reply to this email directly or view it on GitHub.

syclik commented 9 years ago

Weird. I couldn't reproduce it using current develop, but I do see it with v2.6.0.

syclik commented 9 years ago

It's seed dependent. In CmdStan, random seed=5 will trigger the bug.

syclik commented 9 years ago

with the data set I have...

jl3631 commented 9 years ago

Hey, I am the user who posted this problem on Stan group. After setting a higher acceptance rate and lowering the stepsize, the program is no longer giving me error. But the consistent issue is that, as Bob mentioned in the other post, the effective sample size is really low and the standard error of the estimates are really wide. That suggests that the model can be poorly identified. I also switched to estimate a simpler version with a 2-segment mixture of normal (rather than complete heterogeneity at individual level). Still, it has the same issue of large s.t.d. and low effective size. Overall, it seems that the model can't be fit well in stan :( ..... Any idea why?

bob-carpenter commented 9 years ago

It looks like a bug, but we haven't been able to track it down yet.

As to making the model more efficient, check out the pairs() plots to see if there are any pathological correlations indicating non-identifiability.

std err for the mean estimate is posterior std-dev / sqrt(n_eff). You need a high enough n_eff to bring this down to where you want it. The absolute value of it doesn't really matter. It may remain high if std-dev is high, but that's largely just saying the posterior is very fat.

On Mar 1, 2015, at 2:14 PM, jl3631 notifications@github.com wrote:

Hey, I am the user who posted this problem on Stan group. After setting a higher acceptance rate and lowering the stepsize, the program is no longer giving me error. But the consistent issue is that, as Bob mentioned in the other post, the effective sample size is really low and the standard error of the estimates are really wide. That suggests that the model can be poorly identified. I also switched to estimate a simpler version with a 2-segment mixture of normal (rather than complete heterogeneity at individual level). Still, it has the same issue of large s.t.d. and low effective size. Overall, it seems that the model can't be fit well in stan :( ..... Any idea why?

— Reply to this email directly or view it on GitHub.

syclik commented 9 years ago

Yes, it's definitely a bug and I'm trying to track it down. Hopefully I can find it soon. Thanks for helping explain the issue.

On Saturday, February 28, 2015, Bob Carpenter notifications@github.com wrote:

It looks like a bug, but we haven't been able to track it down yet.

As to making the model more efficient, check out the pairs() plots to see if there are any pathological correlations indicating non-identifiability.

std err for the mean estimate is posterior std-dev / sqrt(n_eff). You need a high enough n_eff to bring this down to where you want it. The absolute value of it doesn't really matter. It may remain high if std-dev is high, but that's largely just saying the posterior is very fat.

  • Bob

On Mar 1, 2015, at 2:14 PM, jl3631 <notifications@github.com javascript:_e(%7B%7D,'cvml','notifications@github.com');> wrote:

Hey, I am the user who posted this problem on Stan group. After setting a higher acceptance rate and lowering the stepsize, the program is no longer giving me error. But the consistent issue is that, as Bob mentioned in the other post, the effective sample size is really low and the standard error of the estimates are really wide. That suggests that the model can be poorly identified. I also switched to estimate a simpler version with a 2-segment mixture of normal (rather than complete heterogeneity at individual level). Still, it has the same issue of large s.t.d. and low effective size. Overall, it seems that the model can't be fit well in stan :( ..... Any idea why?

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1308#issuecomment-76572595.

syclik commented 9 years ago

Here's a smaller data set (I called it heckman_hete.data.R)

I ran diagnose mode with random seed=5 and I see NaNs for the derivatives. (./heckman_hete diagnose data file=heckman_hete.data.R random seed=5)

ID <- 
c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8,
8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9,
9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10)
M <- 228
N <- 10
nB <- 5
nB1 <- 2
nB2 <- 3
nZ <- 3
X1 <- 
structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.387671611559369,
-0.0538050405829051, -1.37705955682861, -0.41499456329968, -0.394289953710349,
-0.0593133967111857, 1.10002537198388, 0.763175748457544, -0.164523596253587,
-0.253361680136508, 0.696963375404737, 0.556663198673657, -0.68875569454952,
-0.70749515696212, 0.36458196213683, 0.768532924515416, -0.112346212150228,
0.881107726454215, 0.398105880367068, -0.612026393250771, 0.341119691424425,
-1.12936309608079, 1.43302370170104, 1.98039989850586, -0.367221476466509,
-1.04413462631653, 0.569719627442413, -0.135054603880824, 2.40161776050478,
-0.0392400027331692, 0.689739362450777, 0.0280021587806661, -0.743273208882405,
0.188792299514343, -1.80495862889104, 1.46555486156289, 0.153253338211898,
2.17261167036215, 0.475509528899663, -0.709946430921815, 0.610726353489055,
-0.934097631644252, -1.2536334002391, 0.291446235517463, -0.443291873218433,
0.00110535163162413, 0.0743413241516641, -0.589520946188072, -0.568668732818502,
-0.135178615123832, 1.1780869965732, -1.52356680042976, 0.593946187628422,
0.332950371213518, 1.06309983727636, -0.304183923634301, 0.370018809916288,
0.267098790772231, -0.54252003099165, 1.20786780598317, 1.16040261569495,
0.700213649514998, 1.58683345454085, 0.558486425565304, -1.27659220845804,
-0.573265414236886, -1.22461261489836, -0.473400636439312, -0.620366677224124,
0.0421158731442352, -0.910921648552446, 0.158028772404075, -0.654584643918818,
1.76728726937265, 0.716707476017206, 0.910174229495227, 0.384185357826345,
1.68217608051942, -0.635736453948977, -0.461644730360566, 1.43228223854166,
-0.508055079939179, -0.207380743601965, -0.392807929441984, -0.319992868548507,
-0.279113302976559, 0.494188331267827, -0.177330482269606, -0.505957462114257,
1.34303882517041, -0.214579408546869, -0.179556530043387, -0.100190741213562,
0.712666307051405, -0.0735644041263263, -0.0376341714670479, -0.681660478755657,
-0.324270272246319, 0.0601604404345152, -0.588894486259664, 0.531496192632572,
-1.51839408178679, 0.306557860789766, -1.53644982353759, -0.300976126836611,
-0.528279904445006, -0.652094780680999, -0.0568967778473925, -1.91435942568001,
1.17658331201856, -1.664972436212, -0.463530401472386, -1.11592010504285,
-0.750819001193448, 2.08716654562835, 0.0173956196932517, -1.28630053043433,
-1.64060553441858, 0.450187101272656, -0.018559832714638, -0.318068374543844,
-0.929362147453702, -1.48746031014148, -1.07519229661568, 1.00002880371391,
-0.621266694796823, -1.38442684738449, 1.86929062242358, 0.425100377372448,
-0.238647100913033, 1.05848304870902, 0.886422651374936, -0.619243048231147,
2.20610246454047, -0.255027030141015, -1.42449465021281, -0.144399601954219,
0.207538339232345, 2.30797839905936, 0.105802367893711, 0.456998805423414,
-0.077152935356531, -0.334000842366544, -0.0347260283112762, 0.787639605630162,
2.07524500865228, 1.02739243876377, 1.2079083983867, -1.23132342155804,
0.983895570053379, 0.219924803660651, -1.46725002909224, 0.521022742648139,
-0.158754604716016, 1.4645873119698, -0.766081999604665, -0.430211753928547,
-0.926109497377437, -0.17710396143654, 0.402011779486338, -0.731748173119606,
0.830373167981674, -1.20808278630446, -1.04798441280774, 1.44115770684428,
-1.01584746530465, 0.411974712317515, -0.38107605110892, 0.409401839650934,
1.68887328620405, 1.58658843344197, -0.330907800682766, -2.28523553529247,
2.49766158983416, 0.667066166765493, 0.5413273359637, -0.0133995231459087,
0.510108422952926, -0.164375831769667, 0.420694643254513, -0.400246743977644,
-1.37020787754746, 0.987838267454879, 1.51974502549955, -0.308740569225614,
-1.25328975560769, 0.642241305677824, -0.0447091368939791, -1.73321840682484,
0.00213185968026965, -0.630300333928146, -0.340968579860405, -1.15657236263585,
1.80314190791747, -0.331132036964496, -1.60551341225308, 0.197193438739481,
0.263175646405474, -0.985826700409291, -2.88892067167955, -0.640481702565115,
0.570507635920485, -0.05972327604261, -0.0981787440052344, 0.560820728620116,
-1.18645863857947, 1.09677704427424, -0.00534402827816569, 0.707310667398079,
1.03410773473746, 0.223480414915304, -0.878707612866019, 1.16296455596733,
-2.00016494478548, -0.544790740001725, -0.255670709156989, -0.166121036765006,
1.02046390878411, 0.136221893102778, 0.407167603423836, -0.0696548130129049,
-0.247664341619331, 0.69555080661964, 1.1462283572158, -2.40309621489187,
0.572739555245841, 0.374724406778655, -0.425267721556076),
.Dim = c(228, 2))
X2 <- 
structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.951012807576816,
-0.389237181718379, -0.284330661799574, 0.857409778079803, 1.7196272991206,
0.270054900937229, -0.42218400978764, -1.18911329485959, -0.33103297887901,
-0.939829326510021, -0.258932583118785, 0.394379168221572, -0.851857092023863,
2.64916688109488, 0.156011675665079, 1.13020726745494, -2.28912397984011,
0.741001157195439, -1.31624516045156, 0.919803677609141, 0.398130155451956,
-0.407528579269772, 1.32425863017727, -0.70123166924692, -0.580614304240536,
-1.00107218102542, -0.668178606753393, 0.945184953373082, 0.433702149545162,
1.00515921767704, -0.390118664053679, 0.376370291774648, 0.244164924486494,
-1.42625734238254, 1.77842928747545, 0.134447660933676, 0.765598999157864,
0.955136676908982, -0.0505657014422701, -0.305815419766971, 0.893673702425513,
-1.0472981490613, 1.97133738622415, -0.383632106288362, 1.65414530227499,
1.51221269395063, 0.082965733585914, 0.5672209148515, -1.02454847953446,
0.323006503022436, 1.04361245835618, 0.733090637454028, -0.45413690915436,
-0.65578185245044, -0.0359224226225114, 1.0691614606768, -0.483974930301277,
-0.121010111327444, -1.29414000382084, 0.494312836014856, 1.30790152011415,
1.49704100940278, 0.814702730897356, -1.86978879020261, 0.48202950412376,
0.456135603301202, -0.353400285829911, 0.170489470947982, -0.864035954126904,
0.679230774015656, -0.327101014653104, -1.56908218514395, -0.367450756170482,
1.36443492906985, -0.334281364731164, 0.732750042209102, 0.946585640197786,
0.00439870432637403, -0.352322305549849, -0.529695509133504, 0.811483733148644,
-1.06345741548281, 0.2462108435364, -0.289499366564312, -2.26488935648794,
-1.40885045607319, 0.916019328792574, -0.1912789505352, 0.803283216133648,
1.8874744633086, 1.47388118110914, 0.677268492312998, 0.379962686566745,
-0.192798426457334, 1.5778917949044, 0.596234109318454, -1.17357694087136,
-0.155642534890318, -1.91890982026984, -0.195258846110637, -2.59232766994599,
1.31400216719805, -0.635543001032135, -0.429978838694188, -0.169318332301963,
0.612218173989128, 0.6783401772227, 0.567951972471672, -0.572542603926126,
-1.36329125627834, -0.388722244337901, 0.277914132450543, -0.823081121572025,
-0.0688409344784646, -1.1676623261298, -0.0083090142156068, 0.128855401597405,
-0.145875628461003, -0.163910956736068, 1.76355200278492, 0.762586512418318,
1.11143108073063, -0.923206952829831, 0.164341838427956, 1.15482518709727,
-0.0565214245264898, -2.12936064823465, 0.344845762099456, -1.90495544558553,
-0.811170153140217, 1.3240043212996, 0.615636849302674, 1.09166895553536,
0.306604861513632, -0.110158762482857, -0.924312773127284, 1.5929137537192,
0.0450105981218803, -0.715128400667883, 0.865223099717138, 1.0744409582779,
1.89565477419858, -0.602997303605094, -0.390867820723137, -0.416222031530192,
-0.375657422820391, -0.366630945702358, -0.295677452700886, 1.44182041019987,
-0.697538291913041, -0.38816750592136, 0.652536452171517, 1.12477244653513,
-0.772110803023928, -0.508086216138287, 0.523620590498017, 1.01775422653797,
-0.251164588087978, -1.42999344738861, 1.70912103210088, 1.43506957231162,
-0.71037114584988, -0.0650675736555838, -1.75946873536593, 0.569722971819101,
1.61234679820178, -1.63728064710472, -0.779568513201747, -0.641176933750564,
-0.681131393571156, -2.03328559561795, 0.500963559247907, -1.53179813996574,
-0.0249976392782635, 0.592984721025293, -0.19819542148464, 0.892008392473526,
-0.0257150709181537, -0.647660450585665, 0.64635941503464, -0.433832740029588,
1.7726111849785, -0.0182597111630101, 0.852814993600531, 0.205162903244026,
-3.00804859892048, -1.36611193132898, -0.424102260144812, 0.236803663745558,
-2.34272312035896, 0.961696633380712, -0.60442573385444, -0.752877279362913,
-1.55561159145999, -1.45389373800106, 0.0563318360544103, 0.509369406673892,
-2.0978829597843, -1.00436197944933, 0.535771722252592, -0.453037084649918,
2.16536850181726, 1.24574667274811, 0.595498034069596, 0.0048844495381889,
0.279360782055259, -0.705906125267626, 0.628017152957251, 1.4802139600857,
1.08342991008328, -0.813244256664091, -1.61887684922359, -0.109655699488241,
0.440889370916978, 1.3509939798086, -1.31860948453356, 0.364384592607448,
0.233499835088518, 1.19395526125905, -0.0279099721295774, -0.357298854504253,
-1.14681413611837, -0.517420483593623, -0.362123772578272, 2.35055432578911,
2.44653137569996, -0.166703279484276, -1.04366743906189, -1.97293493407467,
0.514671633438029, -1.09057358373486, 2.28465932554436, -0.885617572657959,
0.111106429745722, 3.81027668071067, -1.10890999794325, 0.307566624421454,
-1.10689447222516, 0.34765364882196, -0.873264535062044, 0.0773031227369236,
-0.296868642156621, -1.18324224043267, 0.0112926884238996, 0.991601036059367,
1.59396745390374, -1.37271127092917, -0.249610933042319, 1.15942452685555,
-1.1142223478436, -2.52850068885039, -0.935902558513312, -0.967239457935565,
0.0474885923277959, -0.403736793498034, 0.231496128244633, -0.422372407926746,
0.374118394695298, -0.366005774997015, 1.19010144652106, -0.737327525257229,
0.290666645439266, -0.884849568284455, 0.208006478870832, -0.0477301724785263,
-1.68452064613859, -0.144226556618951, 1.18021366550265, 0.681399923199504,
0.143247630887551, -1.19231644371292, 1.16922865278909, 0.0792017089453723,
-0.451773752768677, 1.64202821280077, -0.769592321599518, 0.303360960757848,
1.28173742118351, 1.8248100330388, -0.307022264536836, -0.418418103422641,
0.355135530046085, 0.513481114599557, 0.01860740032874, 1.31844897225668,
-0.065831999838153, -0.700296078317704, 0.537326131521341, -2.20178232235392,
0.391973743798411, 0.496960952423719, -0.224874715429902, -1.11714316533188,
-0.394994603313877, 1.54983034223631, -0.743514479798666, -2.33171211772957,
0.812245442209214, -0.501310657201351, -0.510886565952357, -1.21536404115501,
-0.022558628347222, 0.701239300429707, -0.587482025589799, -0.60672794141879,
1.09664021500029, -0.247509677080296, -0.318127089839435, -0.625778250735075,
0.900434635600238, -0.994193629266225, 0.849250385880362, 0.805702288976437,
-0.46760093599122, 0.848420313723343, 0.986769863596017, 0.57562028851936,
2.02484204541817, -1.96235319122504, -1.16492093063759, -1.37651921373083,
0.167679934469767, 1.58462907915972, 1.67788895297625, 0.488296698394955,
0.878673262560324, -0.144874874029881, 0.468971760074208, 0.376235477129484,
-0.761040275056481, -0.293294933750864, -0.134841264407518, 1.39384581617302,
-1.03698868969829, -2.11433514780366, 0.768278218204398, -0.816160620753485,
-0.436106923160574, 0.904705031282809, -0.763086264548317, -0.341066979637804,
1.50242453423459, 0.5283077123164, 0.542191355464308, -0.136673355717877,
-1.13673385330273, -1.49662715435579, -0.223385643556595, 2.00171922777288,
0.221703816213622, 0.164372909246388, 0.33262360885001, -0.385207999091431,
-1.39875402655789, 2.67574079585182, -0.423686088681983, -0.298601511933064,
-1.79234172685298, -0.248008225068098, -0.247303918374605, -0.255510378526346,
-1.78693810009257, 1.78466277981464, 1.76358634774588, 0.689600221934096,
-1.10074064442392, 0.714509356897811, -0.246470316934021, -0.319786165927205,
1.3626442929547, -1.22788258998268, -0.511219232750473, -0.731194999064964,
0.0197520068767907, -1.57286391470999, -0.703333269828288, 0.715932088907665,
0.465214906423109, -0.973902306404292, 0.559217730468333, -2.43263974540567,
-0.340484926702145, 0.713033194850233, -0.659037386489128, -0.0364026225664853,
-1.59328630152691, 0.847792797247769, -1.85038884866061, -0.323650631690314,
-0.255248112503172, 0.0609212273067499, -0.823491629018878, 1.82973048483604,
-1.4299162157815, 0.254137143004585, -2.93977369533598, 0.00241580894347513,
0.509665571261936, -1.08472000111396, 0.704832976877566, 0.330976349994212,
0.976327472976372, -0.843339880398736, -0.970579904862913, -1.77153134864293,
-0.322470341645949, -1.3388007423545, 0.688156028146055, 0.0712806522505788,
2.18975235927918, -1.15770759930148, 1.18168806409233, -0.527368361608178,
-1.45662801131076, 0.572967370493168, -1.43337770467221, -1.0551850185265,
-0.733111877421952, 0.210907264251298, -0.99892072706101, 1.07785032311666,
-1.19897438250787, 0.216637035184711, 0.143087029788927, -1.06575009105298,
-0.428623410701779, -0.656179476771346, 0.959394326857613, 1.55605263610222,
-1.04079643378825, 0.930572408540247, -0.0754459310121911, -1.96719534905096,
-0.755903642645193, 0.461149160506762, 0.145106631047872, -2.44231132108142,
0.580318685451573, 0.655051998486745, -0.304508837259419, -0.707568232617297,
1.97157201447823, -0.0899986806477491, -0.0140172519319242, -1.12345693711015,
-1.34413012337621, -1.52315577106209, -0.421968210382174, 1.36092446408098,
1.75379484533487, 1.56836473435279, 1.29675556994775, -0.237596251443612,
-1.22415013568071, -0.327812680140072, -2.41245027627225),
.Dim = c(228, 3))
Y1 <- 
structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1,
1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0,
1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1,
1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,
0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0),
.Dim = c(228))
Y2 <- 
structure(c(-1.32672130166661, -0.272397491741742, -0.043585251511939,
1.79499202209274, -0.762466734417127, 1.74191728763526, -1.90259071740488,
-0.820711294833164, -0.237866962537207, 0.712587944475529, -2.49569640090772,
1.26396350672239, -2.54800712915759, -1.38496510992727, 3.35945290568345,
-1.83863292464889, 1.17609896946599, 0, 2.11872359285194, -1.31929237116142,
1.08227344050472, -0.218344251544973, 1.31129109261994, 2.58456222114515,
-2.25606159874727, -3.19792906136904, 2.82750136562783, -1.81292038900523,
-1.63207316597761, 1.62003366092021, 0.780549521505916, 1.2551569007175, 0,
0.442181008379333, 0, -0.176746461422043, -5.03160309162317, -2.40043216735041,
1.98519501283647, -1.01623231820833, -0.724172427835045, 1.58464897284282, 0,
-2.58970773732048, 2.0722820062602, 2.99500277070937, 1.22097096965757,
1.34361056326913, 5.67382750848485, 0.946846979438166, 4.75265726818859, 0,
3.64085379698848, 1.50365817158466, -0.677749973277863, 0, 0.452923129060468,
1.4651484027212, 1.5085048533847, 1.79433909513694, -1.01903035095818,
-0.261497331307764, 2.49544378903178, -0.019351215005303, 0, 0.698121926116175,
0, 4.65345982193844, 0.153106468137252, -0.156258476710303, 0, 2.1082148284838,
0, 0.511367841775971, 3.55877189984535, -0.889036449205811, -0.954806961748562,
1.43918758813355, 4.00252969361112, 2.59959733689882, -0.922927371914307,
-2.77103826345727, 1.51490587626788, -0.391718208357271, -2.12059690637253,
1.57435252612826, -0.862247734707109, -2.69693933301966, 0.817123538476595,
-0.687211847430289, 10.2393603687958, 1.57807061293222, 2.43207791287309,
0.407472681772658, -2.65173215485659, -1.57259225805081, -4.47057126987253,
3.24691028660409, -0.759661328827953, -1.82261164792581, 2.33656204680281, 0,
3.76165789457243, 0, -1.29440625552177, 3.40985057782188, 6.74417162638312,
-0.612994195711007, 0, -2.02000275969661, -0.486551521249623, 4.39391100633023,
0, -4.51559131228893, -6.0472317782329, 0.494702342133359, 0, 0,
3.98375853949426, 2.00591916856819, -5.84137531510099, 0, 0, 0,
1.85185333417557, 6.84568664969698, 0, -1.22051010654614, -1.6043035133555,
7.76165298802623, 0.614502573481057, -1.38458370994503, 1.07344578736609,
2.64766432110622, 0.617216294088571, 0, 0.381355751841739, 1.17109736031794,
0.781523874575068, 2.37736181986408, 0.614461952580383, -3.32633126836561,
5.24430768879886, 4.39778840924808, 3.0523232721201, 1.5287670272617,
3.50865212662069, 4.75107318341444, 0, 0.379653636625668, 3.92918081955701, 0,
6.48036983549671, 4.31078918511156, -3.6597191454401, 2.00249390516991,
-0.715652024490913, 7.08008749999505, -0.861573385301927, 8.48891484001895,
-1.81142760751555, 2.99580374102964, 0, 0, -3.79453096161545, 7.02759918507819,
2.24446406921315, 12.4023559004377, 0.547429264601984, -0.560594595326143,
7.91690921059814, -2.33292687663183, 0, 1.51686195496685, 3.39659247113866,
1.42149807394497, 5.21475739365423, 2.09049069730701, 5.36937223752404,
-3.61842506908661, -0.581607507599348, -4.7388551171322, 1.42017900477619,
-7.23276694437925, 0, 0, -2.76253007575423, 6.94283531356559, 0,
2.32979488591262, 0.976787480914012, 6.32159328285804, -2.01881159095577,
3.85991113080169, -0.381810726197835, 0, 1.22407662616061, 0.333910626322032, 0,
0, -0.83965616136632, 2.21541339535127, -4.13855617731385, 0.0888354767556315,
5.07252117876366, 0, -1.89176252907632, 4.84085547913948, 6.06035508539944,
1.03989922526731, 0.418384900454956, 6.64890250048004, 4.04731292121244, 0,
-2.66625809187912, 1.48166190723097, 0.427121471107529, 1.48903438083323,
0.540947483254646, 3.61314986716205, -2.70008605209449, -1.38667022567468,
-5.77894400062867, -1.78280549473563, 0, 5.88553814228852, 3.57877895161916, 0),
.Dim = c(228))
Z <- 
structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1.41151852937362, -1.26975459036438,
0.410201743315375, -0.371094389438452, 3.77712562878605, 1.05826647400644,
-1.22676256130549, -1.42008752155136, -1.00287018939144, -0.293378768873428,
-2.10010785794438, -0.847704036710729, -1.08226776472504, 0.357943212160366,
1.70632777092834, 0.682994524311054, -0.595258858943701, -1.19697311519676,
0.342796243731764, -0.388936679615894),
.Dim = c(10, 3))
syclik commented 9 years ago

I've narrowed it a little further: it looks like the nans are introduced by the derivatives of normal_cdf_log() when the values push the values to the boundaries.

syclik commented 9 years ago

@betanalpha @bob-carpenter, there's at least one issue here. The call to gradient() within src/stan/mcmc/hmc/hamiltonians/base_hamiltonian.hpp within the update() method returns nan gradients. I think update() should stop at that point. Thoughts?

I'm still trying to track down what causes the nan to appear in the first place.

jl3631 commented 9 years ago

Thanks. I can circumvent this NA problem by giving the model a good starting value. Although it can run now, I find it is really difficult to raise the n_eff for this model (so far still around 5 out of 1000).

bob-carpenter commented 9 years ago

Yes, I think on getting a NaN gradient we should throw and reject (n_divergent++) rather than returning NaNs.

When you fix that, it shouldn't be an issue.

If it's being caused by a function returning NaNs, then we should take that function and have it throw exceptions if that's possible.

I was hoping this would be related to not catching errors in transformed parameters. We should fix that, too, if the error report was correct.

On Mar 7, 2015, at 12:24 AM, Daniel Lee notifications@github.com wrote:

@betanalpha @bob-carpenter, there's at least one issue here. The call to gradient() within src/stan/mcmc/hmc/hamiltonians/base_hamiltonian.hpp within the update() method returns nan gradients. I think update() should stop at that point. Thoughts?

I'm still trying to track down what causes the nan to appear in the first place.

— Reply to this email directly or view it on GitHub.

bob-carpenter commented 9 years ago

Is that with multiple chains or with a single chain? What do the traceplots and pair plots look like? That is, are you wandering all over in the traceplots with heavy correlation and do the pair plots show a lot of pairwise correlation among parameters?

We really need an option to print out all the posterior parameter correlations as part of print. That would be much faster for diagnosis than using the pairs plot. I actually nerfed RStudio during a demo by building a predictive model without editing out the call to pairs --- it couldn't handle 1000 parameters for obvious reasons.

On Mar 7, 2015, at 2:22 AM, jl3631 notifications@github.com wrote:

Thanks. I can circumvent this NA problem by giving the model a good starting value. Although it can run now, I find it is really difficult to raise the n_eff for this model (so far still around 5 out of 1000).

— Reply to this email directly or view it on GitHub.

andrewgelman commented 9 years ago

There is a standard graph showing the posterior correlations as little ellipses. If you’re only showing these little ellipses, they can be pretty tiny so you can show lots of corrs at once.

On Mar 7, 2015, at 12:13 AM, Bob Carpenter notifications@github.com wrote:

Is that with multiple chains or with a single chain? What do the traceplots and pair plots look like? That is, are you wandering all over in the traceplots with heavy correlation and do the pair plots show a lot of pairwise correlation among parameters?

We really need an option to print out all the posterior parameter correlations as part of print. That would be much faster for diagnosis than using the pairs plot. I actually nerfed RStudio during a demo by building a predictive model without editing out the call to pairs --- it couldn't handle 1000 parameters for obvious reasons.

  • Bob

On Mar 7, 2015, at 2:22 AM, jl3631 notifications@github.com wrote:

Thanks. I can circumvent this NA problem by giving the model a good starting value. Although it can run now, I find it is really difficult to raise the n_eff for this model (so far still around 5 out of 1000).

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1308#issuecomment-77674220.

jl3631 commented 9 years ago

I ran a slightly different model with two chains. Although the trace plots seem fine for all the parameters, the parameters beta that capture the continuous heterogeneity are heavily serial correlated. I also see strong correlation among a few parameter estimates.

From: Bob Carpenter notifications@github.com<mailto:notifications@github.com> Reply-To: stan-dev/stan reply@reply.github.com<mailto:reply@reply.github.com> Date: Saturday, March 7, 2015 at 12:13 AM To: stan-dev/stan stan@noreply.github.com<mailto:stan@noreply.github.com> Cc: Jia Liu jl3631@columbia.edu<mailto:jl3631@columbia.edu> Subject: Re: [stan] NaN error message for parameter (#1308)

Is that with multiple chains or with a single chain? What do the traceplots and pair plots look like? That is, are you wandering all over in the traceplots with heavy correlation and do the pair plots show a lot of pairwise correlation among parameters?

We really need an option to print out all the posterior parameter correlations as part of print. That would be much faster for diagnosis than using the pairs plot. I actually nerfed RStudio during a demo by building a predictive model without editing out the call to pairs --- it couldn't handle 1000 parameters for obvious reasons.

On Mar 7, 2015, at 2:22 AM, jl3631 notifications@github.com<mailto:notifications@github.com> wrote:

Thanks. I can circumvent this NA problem by giving the model a good starting value. Although it can run now, I find it is really difficult to raise the n_eff for this model (so far still around 5 out of 1000).

— Reply to this email directly or view it on GitHub.

— Reply to this email directly or view it on GitHubhttps://github.com/stan-dev/stan/issues/1308#issuecomment-77674220.

bgoodri commented 9 years ago

We can look into the FE_INVALID flag http://en.cppreference.com/w/cpp/header/cfenv but it was a bit different before the C++11 standard. But the basic idea would be to clear it immediately before the gradient is evaluated and check it immediately after.

bob-carpenter commented 9 years ago

Nice --- that's a great idea. We can always continue to ifdef our way to C++ bliss if we need something different for C++11.

On Mar 11, 2015, at 1:28 AM, bgoodri notifications@github.com wrote:

We can look into the FE_INVALID flag http://en.cppreference.com/w/cpp/header/cfenv but it was a bit different before the C++11 standard. But the basic idea would be to clear it immediately before the gradient is evaluated and check it immediately after.

— Reply to this email directly or view it on GitHub.

syclik commented 9 years ago

@bgoodri, I'm having trouble to get cfenv / fenv.h to properly catch when things get to NaN. I might not be understanding how this works. Is that the best reference to look at?

bgoodri commented 9 years ago

I think so, at least for the C++11 case. Does it work as expected if you flip that on?

On Wed, Mar 11, 2015 at 10:00 PM, Daniel Lee notifications@github.com wrote:

@bgoodri https://github.com/bgoodri, I'm having trouble to get cfenv / fenv.h to properly catch when things get to NaN. I might not be understanding how this works. Is that the best reference to look at?

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1308#issuecomment-78410790.

syclik commented 9 years ago

Haven't checked yet. I'll let you know, maybe tomorrow.

On Wed, Mar 11, 2015 at 10:05 PM, bgoodri notifications@github.com wrote:

I think so, at least for the C++11 case. Does it work as expected if you flip that on?

On Wed, Mar 11, 2015 at 10:00 PM, Daniel Lee notifications@github.com wrote:

@bgoodri https://github.com/bgoodri, I'm having trouble to get cfenv / fenv.h to properly catch when things get to NaN. I might not be understanding how this works. Is that the best reference to look at?

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1308#issuecomment-78410790.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1308#issuecomment-78411230.

bob-carpenter commented 9 years ago

See:

http://stackoverflow.com/questions/15655070/how-to-detect-double-precision-floating-point-overflow-and-underflow

These are POSIX, C99, and C++11, but not C++03 according to this.

So no point putting it in until we have the go-ahead to use C++11 on R and Python.

On Mar 12, 2015, at 1:09 PM, Daniel Lee notifications@github.com wrote:

Haven't checked yet. I'll let you know, maybe tomorrow.

On Wed, Mar 11, 2015 at 10:05 PM, bgoodri notifications@github.com wrote:

I think so, at least for the C++11 case. Does it work as expected if you flip that on?

On Wed, Mar 11, 2015 at 10:00 PM, Daniel Lee notifications@github.com wrote:

@bgoodri https://github.com/bgoodri, I'm having trouble to get cfenv / fenv.h to properly catch when things get to NaN. I might not be understanding how this works. Is that the best reference to look at?

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1308#issuecomment-78410790.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1308#issuecomment-78411230.

— Reply to this email directly or view it on GitHub.

syclik commented 9 years ago

I just checked with the current version of CmdStan and I'm not seeing any issues. Can I close this? @jl3631, does your model run now?

jl3631 commented 9 years ago

Hi Lee:

Thanks. Do you mean that you have updated the stan version to solve the problem? Or there is no issue at all.
I revised my model a little bit so I rarely run into have these NaN problems. So please close it.
Really appreciate your help on this.

Best, Jia

From: Daniel Lee notifications@github.com<mailto:notifications@github.com> Reply-To: stan-dev/stan reply@reply.github.com<mailto:reply@reply.github.com> Date: Wednesday, March 25, 2015 at 3:19 PM To: stan-dev/stan stan@noreply.github.com<mailto:stan@noreply.github.com> Cc: Jia Liu jl3631@columbia.edu<mailto:jl3631@columbia.edu> Subject: Re: [stan] NaN error message for parameter (#1308)

I just checked with the current version of CmdStan and I'm not seeing any issues. Can I close this? @jl3631https://github.com/jl3631, does your model run now?

— Reply to this email directly or view it on GitHubhttps://github.com/stan-dev/stan/issues/1308#issuecomment-86179979.

syclik commented 9 years ago

@jl3631: thank you. The Stan development version (unreleased) doesn't have an issue on the model that was posted in the comments. I'm glad the revised model works for you.