MRE

The for-loop comparing INLA, internal C laplace and glm results, shows an over/underflow warning originating from laplace calculations in node_binomial.c. In different parts (e.g. line 940) , we exponentiate large numbers raising the overflow warning and resulting in Inf values which can lead to issues later down-stream.

load(system.file("extdata", "QA_glm_case1_data.RData", package = "abn")) # or download from here: http://r-bayesian-networks.org/source/Rcode/QA_glm_case2.tar.gz

## 1. plot of raw differences, a wide range of values since both poisson, bin and gaus distributions used.
## vast majority as almost identical, but some are rather different
#plot(mycache.inla$mlik-mycache.c$mlik);

## 2. also look at % differences - gives a crude overview
## as 1. so suggests perhaps not just floating point rounding issue e.g. in log transforms
perc<-100*(mycache.c$mlik-mycache.inla$mlik)/mycache.c$mlik;

## 3. get all mliks which are adrift by more than 1%
bad<-which(abs(perc)>1);

## go through each and check for issues
## 
mydat<-ex2.dag.data;## this data comes with abn see ?ex2.dag.data
mydat.std<-mydat;
## setup distribution list for each node
mydists<-list(b1="binomial",
              g1="gaussian",
              p1="poisson",
              b2="binomial",
              g2="gaussian",
              p2="poisson",
              b3="binomial",
              g3="gaussian",
              p3="poisson",
              b4="binomial",
              g4="gaussian",
              p4="poisson",
              b5="binomial",
              g5="gaussian",
              p5="poisson",
              b6="binomial",
              g6="gaussian",
              p6="poisson"
             );
## create standardised dataset for comparison with glm
for(i in 1:length(mydists)){if(mydists[[i]]=="gaussian"){## then std data for comparison with glm_case
                                                            mydat.std[,i]<-(mydat.std[,i]-mean(mydat.std[,i]))/sd(mydat.std[,i]);}
}
## create empty matrix which will be filled with nodes as needed
mydag<-matrix(rep(0,dim(mydat)[2]^2),ncol=dim(mydat)[2]);colnames(mydag)<-rownames(mydag)<-names(mydat);

## loop through each node which differed from INLA by at least 1% and compare with glm() modes
for(i in 1:length(bad)){

  mydag[,]<-0;## reset
  node<-mycache.c$child[bad[i]];pars<-mycache.c$node.defn[bad[i],];
  form<-as.formula(paste(colnames(mydag)[node],"~",paste(colnames(mydag)[which(pars==1)],collapse="+",sep=""),sep=""));
  family<-mydists[[node]];
  mydag[node,]<-pars;## copy "bad" node into DAG
  myres.c<-fitabn(dag.m=mydag,data.df=mydat,data.dists=mydists,max.mode.error=0,compute.fixed=TRUE);## use C
  myres.inla<-fitabn(dag.m=mydag,data.df=mydat,data.dists=mydists,max.mode.error=100,compute.fixed=TRUE,n.grid=NULL,std.area=FALSE);## use INLA
  myres.glm<-glm(form,data=mydat.std,family=family);
  cat("################ bad=",i,"#################\n");
  cat("\n# 1. glm()\n");print(coef(myres.glm));
  cat("\n# 2. C\n");print(myres.c$modes[[node]]);
  cat("\n# 3. INLA\n");print(myres.inla$modes[[node]]);
  cat("\n###########################################\n");
}

furrer-lab / abn

Many "known" overflows in node_binomial.c #68

MRE

Suggested solution