timjzee / rethinking

Statistical Rethinking course and book package

Fix Error in symbols[[left_symbol]] : no such index at level 2 #2

Closed by timjzee 4 months ago

timjzee commented 4 months ago

Very descriptive error....

Occurs when log_lik=TRUE is requested for a multi_normal likelihood with more than 2 outcome variables, e.g.:

N <- 500
U_sim <- rnorm( N )
Q_sim <- sample( 1:4 , size=N , replace=TRUE )
E_sim <- rnorm( N , U_sim + Q_sim )
W_sim <- rnorm( N , U_sim + 0*E_sim )
dat_sim <- list(
  U=standardize(U_sim) ,
  W=standardize(W_sim) ,
  E=standardize(E_sim) ,
  Q=standardize(Q_sim) )

m14.6 <- ulam(
  alist(
    c(U,W,E) ~ multi_normal( c(muU,muW,muE) , Rho , Sigma ),
    muU <- aU + bEU*E,
    muW <- aW + bEW*E,
    muE <- aE + bQE*Q,
    c(aU,aW,aE) ~ normal( 0 , 0.2 ),
    c(bEU,bEW,bQE) ~ normal( 0 , 0.5 ),
    Rho ~ lkj_corr( 2 ),
    Sigma ~ exponential( 1 )
  ), data=dat_sim , chains=4 , cores=4, log_lik=T, sample = F)

Gives:

Error in symbols[[left_symbol]] : no such index at level 2

timjzee commented 4 months ago

Easier to find than I thought:

https://github.com/timjzee/rethinking/blob/1a040c38c01652cfa495c3cff4f75ca1807352a0/R/ulam-function.R#L1227-L1228

timjzee commented 4 months ago

I think it actually happens here:

https://github.com/timjzee/rethinking/blob/1a040c38c01652cfa495c3cff4f75ca1807352a0/R/ulam-function.R#L1036-L1044

That block only runs when log_lik=TRUE, and inside it left_symbol is used as an index into symbols.

timjzee commented 4 months ago

I think I'm right because if I change the code to:

 if ( i==1 && log_lik==TRUE ) { 
     # add by default 
     built <- compose_distibution( left_symbol , flist[[i]] , as_log_lik=TRUE ) 
     m_gq2 <- concat( m_gq2 , built ) 
     message(left_symbol)
     N <- symbols[[left_symbol]]$dims[[2]]
     message("Worked") 
     m_gq1 <- concat( m_gq1 , indent , "vector[" , N , "] log_lik;\n" ) 
     # save N to attr so nobs/compare can get it later 
     nobs_save <- N 
 }

And run the test model, I get:

UWE
Error in symbols[[left_symbol]] : no such index at level 2
Calls: ulam
Execution halted

So left_symbol is printed, giving UWE, but we never see Worked, so execution does not get past

N <- symbols[[left_symbol]]$dims[[2]]

timjzee commented 4 months ago

So something interesting happens when I test with two outcome variables:

m14.6 <- ulam(
  alist(
    c(W,E) ~ multi_normal( c(muW,muE) , Rho , Sigma ),
    muW <- aW + bEW*E,
    muE <- aE + bQE*Q,
    c(aW,aE) ~ normal( 0 , 0.2 ),
    c(bEW,bQE) ~ normal( 0 , 0.5 ),
    Rho ~ lkj_corr( 2 ),
    Sigma ~ exponential( 1 )
  ), data=dat_sim , chains=4 , cores=4, log_lik=T, sample = F)

cat(m14.6$model)

This does not give an error, but it does show an interesting bug in the generated Stan code:

generated quantities{
    vector[] log_lik;
     vector[500] muW;
     vector[500] muE;

So with two outcome variables the N in the block below could not be determined either; it just fails silently and leaves the vector size empty:

 if ( i==1 && log_lik==TRUE ) { 
     # add by default 
     built <- compose_distibution( left_symbol , flist[[i]] , as_log_lik=TRUE ) 
     m_gq2 <- concat( m_gq2 , built ) 
     message(left_symbol)
     N <- symbols[[left_symbol]]$dims[[2]]
     message("Worked") 
     m_gq1 <- concat( m_gq1 , indent , "vector[" , N , "] log_lik;\n" ) 
     # save N to attr so nobs/compare can get it later 
     nobs_save <- N 
 }

We get this bug because symbols has no entry for the combined outcome c(W,E), only for the individual variables. The same goes for c(U,W,E) if we test with 3 outcome variables, but mysteriously in that case we get the error, which interrupts the building of the Stan code. Regardless, to fix it, we can simply use the first element of left_symbol. It's a bit of a hack, but it works:

N <- symbols[[left_symbol[1]]]$dims[[2]]
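
The difference between the two- and three-outcome cases looks like a consequence of how R's `[[` treats a character vector longer than one: it indexes recursively, so `symbols[[c("W","E")]]` is read as `symbols[["W"]][["E"]]`. The sketch below reproduces both failure modes and the fix on a hypothetical stand-in for `symbols`. The `dims = list("vector", 500)` layout is an assumption made for illustration and is not taken from the package source.

```r
# Hypothetical stand-in for ulam's internal `symbols` list.
# The dims layout (type, length) is assumed for illustration only.
symbols <- list(
  U = list(dims = list("vector", 500)),
  W = list(dims = list("vector", 500)),
  E = list(dims = list("vector", 500))
)

# Two outcomes: recursive indexing looks up symbols[["W"]][["E"]].
# "E" is not a name inside the W entry, and because this is the final
# indexing step on a list, R returns NULL instead of raising an error.
n_two <- symbols[[c("W", "E")]]$dims[[2]]
decl_two <- paste0("vector[", n_two, "] log_lik;")
print(decl_two)

# Three outcomes: looking up "W" inside the U entry is now an intermediate
# step, so R raises an error instead of returning NULL.
err_three <- tryCatch(
  symbols[[c("U", "W", "E")]]$dims[[2]],
  error = function(e) conditionMessage(e)
)
print(err_three)

# The fix from this issue: index with the first outcome name only.
left_symbol <- c("U", "W", "E")
n_fixed <- symbols[[left_symbol[1]]]$dims[[2]]
print(paste0("vector[", n_fixed, "] log_lik;"))
```

Using only the first name assumes all outcomes in the multi_normal have the same length, which holds for the test models in this issue.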

timjzee commented 4 months ago

Tested with a single outcome variable as well and it works.
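
A single outcome is unaffected because `left_symbol[1]` on a length-one vector returns the same value. A small check along these lines covers one, two and three outcomes. It is only a sketch: `symbols` is a hypothetical stand-in with an assumed `dims` layout, and the helper `log_lik_length()` is illustrative and not part of rethinking.

```r
# Hypothetical stand-in for ulam's `symbols`; dims layout is assumed.
symbols <- list(
  U = list(dims = list("vector", 500)),
  W = list(dims = list("vector", 500)),
  E = list(dims = list("vector", 500))
)

# Illustrative helper (not in rethinking): number of observations for the
# log_lik vector, taken from the first outcome variable.
log_lik_length <- function(symbols, left_symbol) {
  first <- left_symbol[1]
  entry <- symbols[[first]]
  if (is.null(entry)) {
    stop("outcome '", first, "' not found in symbols")
  }
  entry$dims[[2]]
}

# One, two and three outcome variables should all resolve to the same N.
cases <- list("E", c("W", "E"), c("U", "W", "E"))
lengths_found <- vapply(
  cases,
  function(s) log_lik_length(symbols, s),
  numeric(1)
)
print(lengths_found)
```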