rmcelreath / stat_rethinking_2023

Statistical Rethinking Course for Jan-Mar 2023
Creative Commons Zero v1.0 Universal

phylogenetic imputation of a binary predictor #7

Open tbendixen opened 1 year ago

tbendixen commented 1 year ago

Thanks for yet another round of awesome lectures!

I have a question about the imputation procedure in the missing data lecture.

So, in the following model, the primate phylogeny is used to impute missing data in predictor G (group size).

mBMG_OU3 <- ulam(
    alist(
        B ~ multi_normal( mu , K ),
        mu <- a + bM*M + bG*G,
        G ~ multi_normal( nu , KG ),
        nu <- aG + bMG*M,
        M ~ normal(0,1),
        matrix[N_spp,N_spp]:K <- cov_GPL1(Dmat,etasq,rho,0.01),
        matrix[N_spp,N_spp]:KG <- cov_GPL1(Dmat,etasqG,rhoG,0.01),
        c(a,aG) ~ normal( 0 , 1 ),
        c(bM,bG,bMG) ~ normal( 0 , 0.5 ),
        c(etasq,etasqG) ~ half_normal(1,0.25),
        c(rho,rhoG) ~ half_normal(3,0.25)
    ), data=dat_all , chains=4 , cores=4 , sample=TRUE )
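For reference, the same model in mathematical notation. This is only a transcription of the code above, assuming that `cov_GPL1` builds the L1-distance (Ornstein-Uhlenbeck) kernel with `0.01` added to the diagonal, and that `half_normal(m, s)` means a normal distribution truncated at zero. The subscripted symbols simply mirror the parameter names in the code.

```latex
\begin{aligned}
B &\sim \mathrm{MVNormal}(\mu, K), &
\mu_i &= a + b_M M_i + b_G G_i \\
G &\sim \mathrm{MVNormal}(\nu, K_G), &
\nu_i &= a_G + b_{MG} M_i \\
M_i &\sim \mathrm{Normal}(0, 1) \\
K_{ij} &= \eta^2 \exp(-\rho\, D_{ij}) + 0.01\,\delta_{ij}, &
K_{G,ij} &= \eta_G^2 \exp(-\rho_G\, D_{ij}) + 0.01\,\delta_{ij} \\
a,\ a_G &\sim \mathrm{Normal}(0, 1), &
b_M,\ b_G,\ b_{MG} &\sim \mathrm{Normal}(0, 0.5) \\
\eta^2,\ \eta_G^2 &\sim \mathrm{HalfNormal}(1, 0.25), &
\rho,\ \rho_G &\sim \mathrm{HalfNormal}(3, 0.25)
\end{aligned}
```

Here $D_{ij}$ is the phylogenetic distance between species $i$ and $j$ (`Dmat`), and $\delta_{ij}$ is 1 when $i = j$ and 0 otherwise. Missing entries of $G$ are imputed because the second line gives them a distribution that shares covariance with the observed species.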

My question is, what if G was a binary predictor? For instance, we might code a species either as solitary (S=0) or social (S=1) and use that to predict brain size B. We then want to use phylogenetic information in the imputation of S.

My guess is that the likelihood for S would not be multivariate normal, but what would the code look like then?
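To make the guess concrete, here is one possible way to write such a model down. It is a sketch under assumptions, not something taken from the lecture: the phylogenetic covariance moves to a latent vector $z$ on the logit scale, and the symbols $a_S$, $b_{MS}$, $b_S$, $z$, $K_S$, $\eta_S^2$ and $\rho_S$ are hypothetical names introduced only for illustration.

```latex
\begin{aligned}
B &\sim \mathrm{MVNormal}(\mu, K), &
\mu_i &= a + b_M M_i + b_S S_i \\
S_i &\sim \mathrm{Bernoulli}(p_i), &
\operatorname{logit}(p_i) &= a_S + b_{MS} M_i + z_i \\
z &\sim \mathrm{MVNormal}(0, K_S), &
K_{S,ij} &= \eta_S^2 \exp(-\rho_S\, D_{ij}) + 0.01\,\delta_{ij}
\end{aligned}
```

Because Stan cannot sample discrete parameters, the missing $S_i$ would have to be marginalized out rather than imputed directly. With $B$ jointly multivariate normal, that means summing over every configuration $s$ of the $m$ missing values:

```latex
p(B \mid \cdot) \;=\; \sum_{s \in \{0,1\}^m}
\Bigl[\, \prod_{i \in \mathrm{mis}} p_i^{\,s_i} (1 - p_i)^{1 - s_i} \Bigr]\,
\mathrm{MVNormal}\bigl(B \mid \mu(s),\, K\bigr)
```

The sum has $2^m$ terms, so this direct approach is only practical when few species are missing S.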

Thanks in advance!