igraph / rigraph

igraph R package
https://r.igraph.org
543 stars 201 forks source link

Assertion failed: v->stor_begin != NULL. #770

Closed heathergaya closed 1 year ago

heathergaya commented 1 year ago

I am running an MCMC in NIMBLE using the parallel package. This model works fine when run not in paralle. When using clusterEvalQ, I received the error:

Error in checkForRemoteErrors(lapply(cl, recvResult)) : 
  3 nodes produced errors; first error: At core/core/vector.pmt:483 : Assertion failed: v->stor_begin != NULL. This is an unexpected igraph error; please report this as a bug, along with the steps to reproduce it.
Timing stopped at: 0.63 0.21 1717

Here is the code I was running (the three .txt files referenced are attached to this bug report):

library(nimble)
library(coda)
library(mcmcplots)

nimIBM <- nimbleCode({

  beta0 ~ dnorm(0,.3) #initial lambda intercept
  beta1 ~ dnorm(0,.3) #relationship to covariate (elevation) for initial lambda

  k ~ dexp(10) #movement sd

  delta0 ~ dnorm(0,.3) # location survival intercept 
  delta1 ~ dnorm(0,.3) # location survival relationship to elevation
  age0 ~ dnorm(0,.3) # age survival intercept 
  age1 ~ dnorm(0,.3) # survival relationship to age
  gam0 ~ dnorm(0,.3) # recruitment intercept 
  gam1 ~ dnorm(0,.3) # recruitment relationship to age

  omega ~ dunif(-.4,1) # potential function attraction/repulsion
  p.s ~ dunif(0,1) #sex ratio

  for(t in 1:nocc){
    phi_HY[t] ~ dunif(0,1)
  }

  for(q in 1:npix){
    for(t in 1:nocc){
    logit(loc.surv[q,t]) <- delta0+delta1*cov[q,t]
    phi.loc[q,t] <- loc.surv[q,t] #survival at pixel q
  } #end t
    lambda[q] <- exp(beta0 + beta1*cov[q,1])*(1/npix) #relative probability of any particular location
    lambda.recruit[q] <- exp(beta0 + beta1*cov[q,1])*(1/npix)
  } #end q

  Lambda.recruit[1] <- sum(lambda.recruit[1:npix]) ## Expected new recruits
  Lambda[1] <- sum(lambda[1:npix])

  psi <- Lambda[1]/M #initial alive/dead state in time t = 1

  for(i in 1:M){
    r[i,1] <- 0
    for(t in 2:nocc){
      r[i,t] <- nimEquals(z[i,t],1)*nimEquals(a[i,t], 0)*nimEquals(a[i,t-1],1)*(t-1)
      #if z = 1 and recruitable = 0 and recuitable last year = 1, then 1*1*1*(t-1) = t-1 (the time period it was recruited in)
      # if z = 1 and recruitable = 0 and recruitable last year =0, then 1*1*0*(t-1) = 0 (logically must have been recruited earlier)
    } #end t
    recruit.date[i] <- sum(r[i,1:nocc]) + 1 #what time period did I enter the pop
    recruit.date_eff[i] <- (sum(r[i,1:nocc]) + 1)*(1-nimEquals(sum(z[i,1:nocc]),0)) #what time period did I enter the pop; 0 if I was never real
  } #end i

  alpha0 ~ dnorm(10, .3)
  alpha1 ~ dunif(-2, 2)
  for(l in 1:3){
    d.prob[l] <- 1 #just need to make these all equal
  }
  age.prob[1:3] ~ ddirch(d.prob[1:3]) #probabilities of being age 0, 1, or 2 in year 1

  for(i in 1:M){
    for(t in 1:nocc){
      #logit(fecundity_age[i,t]) <- gam0 + gam1*(age[i,t]+1)  #recruitment prob changes w/ age and abundance of pop
      log(fecundity_dens[i,t]) <- alpha0+alpha1*Nt[t]
      #fecundity[i,t] <- fecundity_age[i,t]*fecundity_dens[i,t]*sex[i]*z[i,t] #density dependence
      fecundity[i,t] <- fecundity_dens[i,t]*sex[i]*z[i,t] #density dependence
      nFledglings[i,t] ~ dpois(fecundity[i,t]) #though these guys don't show up in population counts till next year
      gamma[i,t] <- phi_HY[t]*fecundity[i,t] #males don't count for fecundity  
    } #end t

    sex[i] ~ dbern(p.s) #chance of either sex
    s[i,1,1] ~ dunif(xmin, xmax)  #x coord; even non-real individuals get a location
    s[i,2,1] ~ dunif(ymin, ymax) #y coord
    z[i,1] ~ dbern(psi) #alive/real
    parent[i,1] ~ dcat(z[1:M,1])
    pix_x[i,1] <- nimRound((s[i,1,1]+ .5*pxw)/pxw-xoffset)
    pix_y[i,1] <- nimRound((s[i,2,1]+ .5*pxw)/pxw-yoffset)
    pix[i,1] <- pixels[pix_x[i,1],pix_y[i,1]]
    zeros1[i] ~ dpois(-log(lambda[pix[i,1]]/Lambda[1]))
    a[i,1] <- 1-z[i,1] #available for recruitment 
    age.pre[i] ~ dcat(age.prob[1:3]) #can be 0, 1, or 2 at year 1 but this spits out 1,2,3
    age[i,1] <- (age.pre[i]-1)*z[i,1] #initial age distribution; ages can be 0, 1, or 2
  } #end i

  for(aa in 1:3){ #can only be age 0, 1, or 2
    logit(phi.age[aa]) <- age0 + age1*(aa-1) #survival changes with age
  }

  for(t in 1:nocc){
  R[t] <- sum(gamma[1:M,t]) #expected number of SY for a given year
  Avail[t] <- sum(a[1:M,t]) #make sure this works
  Ent_Prob[t] <- R[t]/Avail[t]
  }

  for(t in 2:nocc){
    gamma_z[1:M,t-1] <- gamma[1:M, t-1]*z[1:M,t-1]
    for(i in 1:M){
      parent[i,t] ~ dcat(gamma_z[1:M, t-1]) #who is my parent, is irrelevant if I'm already alive
      #age.class[i,t] <- age[i,t-1]+1 #can't be older than 2, but phi.age uses (1,2,3) rather than (0,1,2)
      phi.p[i,t] <- phi.loc[pix[i,t-1],t]*phi.age[age[i,t-1]+1]*z[i,t-1]+ a[i,t-1]*Ent_Prob[t-1] #gamma[parent[i,t],t-1] #effective survival or recruitment depending on previous recruitment status
      z[i,t] ~ dbern(phi.p[i,t]) #survival/recruitment determine status

      s[i,1,t] ~ T(dnorm((s[i,1,t-1]+omega*east[pix[i,t-1],t])*(1-a[i,t-1]) + s[parent[i,t], 1, t-1]*a[i,t-1], 1/(k^2)), xmin, xmax) #if I was previous alive, based on my previous location, otherwise based on my parents location
      s[i,2,t] ~  T(dnorm((s[i,2,t-1]+omega*north[pix[i,t-1],t])*(1-a[i,t-1]) + s[parent[i,t], 2, t-1]*a[i,t-1],1/(k^2)), ymin, ymax)
      pix_x[i,t] <- nimRound((s[i,1,t]+ .5*pxw)/pxw - xoffset)
      pix_y[i,t] <- nimRound((s[i,2,t]+ .5*pxw)/pxw - yoffset)
      pix[i,t] <- pixels[pix_x[i,t], pix_y[i,t]]
      a[i,t] <- (1-z[i,t])*a[i,t-1] #available for recruitment 
      age[i,t] <- equals(age[i,t-1], 2)*z[i,t] + (1-equals(age[i,t-1],2))*z[i,t]*(age[i,t-1]+1)
      #age[i,t] <- ifelse(age[i,t-1] ==2 ,2*z[i,t], (age[i,t-1]+1)*z[i,t]) #age 0 if you're not alive or if you're dead
    } #end i
    Nt[t] <- sum(z[1:M,t])
  } #end t

  Nt[1] <- sum(z[1:M,1])

  #Nimble being annoying:
  for(t in 1:nocc){
    for(i in 1:M){
      rrr[i,t] <- nimEquals(recruit.date_eff[i],t)
    }
    Recruits[t] <- sum(rrr[1:M,t])
  }

  ### Detection Process 
  g0 ~ dunif(0,1)
  sigma ~ dunif(0, 100) #no idea
    for(t in 1:nocc){
      for(kk in 1:ntrap){
        for(i in 1:M){
        dist[i,kk,t] <- sqrt((s[i,1,t]-trap[kk,1])^2 + (s[i,2,t]-trap[kk,2])^2)
        p[i,kk,t] <- g0*exp(-dist[i,kk,t]^2/(2*sigma^2))
        }
        for(i in 1:nind){
      y[i,kk,t] ~ dbin(p[i,kk,t]*z[i,t], open[kk,t])
        }
    }
  }

  for(i in (nind+1):M){
    for(t in 1:nocc){
      PrAtLeastOne[i,t] <- 1-prod(1-(p[i,1:ntrap,1:nocc]*open[1:ntrap, 1:nocc]))
      zeros2[i,t] ~ dbern(PrAtLeastOne[i,t]*z[i,t])
    }
  }

})

pars = c("beta0", "beta1", "k", "Nt", "Lambda", "psi", "age0", "age1", "delta0", 
         "delta1", "gam0", "gam1", "p.s", "Recruits", "omega", "alpha0", "alpha1", "phi_HY","g0", "sigma")

nc <- dget("nc.txt")
nd <- dget("nd.txt")
ni <- dget("ni.txt")

adaptInterval = 200
SamplerScale <- 0.05
library(parallel)
cl <- makeCluster(3)
clusterExport(cl = cl, varlist = c("nimIBM", "nc", "nd", "ni", "pars", "adaptInterval", "SamplerScale"))
system.time(
 birds.out <- clusterEvalQ(cl = cl,{
   library(nimble)
   library(coda)
    prepbirds <- nimbleModel(code = nimIBM, constants = nc,
                             data = nd, inits = ni, calculate = T, check = T)
    prepbirds$phi.p[,1] <- 1
    #prepbirds$initializeInfo()
    #prepbirds$calculate()
    mcmcbirds <- configureMCMC(prepbirds, monitors = pars, print = T, control = list(adaptInterval = adaptInterval, scale=.1))
    mcmcbirds$removeSampler(paste0("s[",1:(nc$nind),", 1:2, 1:",nc$nocc,"]"))
    #mcmcbirds$removeSampler(c("gam0", "gam1"))
    for(i in 1:(nc$nind)){
      for(tt in 1:(nc$nocc)){
        mcmcbirds$addSampler(target = paste0("s[",i,", 1:2, ",tt,"]"),
                             type = 'RW_block', silent=TRUE, 
                             control = list(adaptInterval = adaptInterval, propCov=diag(2), scale=SamplerScale, adaptScaleOnly=TRUE)) 
      }
    }

    birdsMCMC <- buildMCMC(mcmcbirds) 
    Cmodel <- compileNimble(prepbirds) #takes a long time (sometimes)
    Compbirds <- compileNimble(birdsMCMC, project = prepbirds)
    Compbirds$run(niter = 100000, nburnin = 20000, thin = 20)
    #birds.out <- as.mcmc(as.matrix(Compbirds$mvSamples))
    return(as.mcmc(as.matrix(Compbirds$mvSamples)))
  }) #this will take awhile and not produce any noticeable output.
)

nd.txt ni.txt nc.txt Version information


sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] mcmcplots_0.4.3 coda_0.19-4     nimble_0.13.1  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9       lattice_0.20-45  codetools_0.2-18 grid_4.2.2       R6_2.5.1         magrittr_2.0.3  
 [7] rlang_1.0.6      cli_3.4.1        sp_1.5-1         raster_3.6-11    tools_4.2.2      igraph_1.3.5    
[13] sfsmisc_1.1-14   compiler_4.2.2   terra_1.6-47     pkgconfig_2.0.3  colorspace_2.0-3 denstrip_1.5.4 
krlmlr commented 1 year ago

Thanks for reporting. Just to manage expectations: how long does the example run until the error occurs? Would it also fail with fewer iterations? Do we need to fix the random seed to make the problem easier to reproduce? Meanwhile, I enjoy a bit of extra heat from my laptop 🙃

Could you please give me a rough idea where igraph is used in this setup?

szhorvat commented 1 year ago

Can you reproduce this with the latest version, 1.4.2?

szhorvat commented 1 year ago

I ran this code for 20 minutes using an ASan-instrumented build of R and igraph, and was unable to reproduce the problem.

nimble uses few igraph functions. I don't have an exhaustive list, but the only non-trivial functions I could find so far were topological.sort() and permute.vertices(). The latter received some relevant bugfixes last year, but I believe these were already included in 1.3.5.

While the assertion failure clearly indicates an issue with igraph, tracking this down will be quite difficult without a small example. The most helpful thing to check would be whether you can reproduce the issue with 1.4.2 (or 1.4.1), @heathergaya, let us know how long it takes to reproduce the issue, and whether it reproduces consistently.

szhorvat commented 1 year ago

One possible explanation here is that your computer ran out of memory, and the failure you saw was due to an unchecked OOM condition. Please check how much memory this calculation is using relative to how much you have available.

szhorvat commented 1 year ago

I am closing this since we are unable to reproduce the issue. @heathergaya, feel free to re-open if you can consistently reproduce it with the latest igraph version.

github-actions[bot] commented 4 months ago

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.