statnet / ergm

Fit, Simulate and Diagnose Exponential-Family Models for Networks

Bus error when running `ergm`: `address 0x0, cause 'invalid alignment'` #565

Closed barracuda156 closed 1 week ago

barracuda156 commented 2 weeks ago

I have encountered the following when running broom tests:

* using R version 4.4.1 (2024-06-14)
* using platform: powerpc-apple-darwin10.0.0d2 (32-bit)
* R was compiled by
    gcc-mp-13 (MacPorts gcc13 13.2.0_4+stdlib_flag) 13.2.0
    GNU Fortran (MacPorts gcc13 13.2.0_4+stdlib_flag) 13.2.0
* running under: OS X Snow Leopard 10.6
. . .
* checking examples ...sh: line 1: 64186 Bus error               LANGUAGE=en _R_CHECK_INTERNALS2_=1 '/opt/local/Library/Frameworks/R.framework/Resources/bin/R' --vanilla --encoding=UTF-8 > 'broom-Ex.Rout' 2>&1 < 'broom-Ex.R'
 ERROR
Running examples in ‘broom-Ex.R’ failed
The error most likely occurred in:

> ### Name: tidy.ergm
> ### Title: Tidy a(n) ergm object
> ### Aliases: tidy.ergm ergm_tidiers
> 
> ### ** Examples
> 
> ## Don't show: 
> if (rlang::is_installed("ergm")) (if (getRversion() >= "3.4") withAutoprint else force)({ # examplesIf
+ ## End(Don't show)
+ 
+ # load libraries for models and data
+ library(ergm)
+ 
+ # load the Florentine marriage network data
+ data(florentine)
+ 
+ # fit a model where the propensity to form ties between
+ # families depends on the absolute difference in wealth
+ gest <- ergm(flomarriage ~ edges + absdiff("wealth"))
+ 
+ # show terms, coefficient estimates and errors
+ tidy(gest)
+ 
+ # show coefficients as odds ratios with a 99% CI
+ tidy(gest, exponentiate = TRUE, conf.int = TRUE, conf.level = 0.99)
+ 
+ # take a look at likelihood measures and other
+ # control parameters used during MCMC estimation
+ glance(gest)
+ glance(gest, deviance = TRUE)
+ glance(gest, mcmc = TRUE)
+ ## Don't show: 
+ }) # examplesIf
> library(ergm)
Loading required package: network

‘network’ 1.18.2 (2023-12-04), part of the Statnet Project
* ‘news(package="network")’ for changes since last version
* ‘citation("network")’ for citation information
* ‘https://statnet.org’ for help, support, and other information

‘ergm’ 4.6.0 (2023-12-17), part of the Statnet Project
* ‘news(package="ergm")’ for changes since last version
* ‘citation("ergm")’ for citation information
* ‘https://statnet.org’ for help, support, and other information

‘ergm’ 4 is a major update that introduces some backwards-incompatible
changes. Please type ‘news(package="ergm")’ for a list of major
changes.

> data(florentine)
> gest <- ergm(flomarriage ~ edges + absdiff("wealth"))
Starting maximum pseudolikelihood estimation (MPLE):
Obtaining the responsible dyads.
Evaluating the predictor and response matrix.
R(64186,0x96c408) malloc: *** error for object 0xffffffff: pointer being reallocated was not allocated
*** set a breakpoint in malloc_error_break to debug
R(64186,0x96c408) malloc: *** error for object 0x330435c0: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
R(64186,0x96c408) malloc: *** error for object 0x330435c0: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug

 *** caught bus error ***
address 0x0, cause 'invalid alignment'

Traceback:
 1: set.objfn(lprec, c(obj))
 2: mple.existence(pl)
 3: ergm.mple(s, s.obs, init = init, control = control, verbose = verbose,     ...)
 4: ergm.fit(nw, target.stats, model, proposal, proposal.obs, info,     control, verbose, ...)
 5: ergm(flomarriage ~ edges + absdiff("wealth"))
 6: eval(ei, envir)
 7: eval(ei, envir)
 8: withVisible(eval(ei, envir))
 9: source(exprs = exprs, local = local, print.eval = print., echo = echo,     max.deparse.length = max.deparse.length, width.cutoff = width.cutoff,     deparseCtrl = deparseCtrl, skip.echo = skip.echo, ...)
10: (if (getRversion() >= "3.4") withAutoprint else force)({    library(ergm)    data(florentine)    gest <- ergm(flomarriage ~ edges + absdiff("wealth"))    tidy(gest)    tidy(gest, exponentiate = TRUE, conf.int = TRUE, conf.level = 0.99)    glance(gest)    glance(gest, deviance = TRUE)    glance(gest, mcmc = TRUE)})
An irrecoverable exception occurred. R is aborting now ...

Any idea why this may be happening?

mbojan commented 1 week ago

All tests and examples pass on our end. I suspect a non-standard OS configuration and/or issues with the R installation?

barracuda156 commented 1 week ago

@mbojan Thank you for responding!

This is a “non-standard OS”, but the failure is atypical if not unique, and a bus error usually suggests a bug in the code. Generally speaking, everything in C/C++/Fortran must work, with the sole exception of packages relying heavily on Apple SDK features (there are very few such packages; consider ps).

(For the context, we have about 5000 R packages in MacPorts, and I run tests wherever they are provided: normally everything works fine on the same OS.)

Could anything in ergm assume a little-endian or 64-bit platform without checking? From the error message, the alignment apparently goes wrong.

mbojan commented 1 week ago

Thanks @barracuda156. I don't think we've had this kind of problem before, and CRAN grinds the C code in the packages quite finely.

Any ideas @krivit ?

barracuda156 commented 1 week ago

CRAN has no checks on big-endian platforms (not even 64-bit BSD or Linux) and none on 32-bit platforms (AFAIK, not even i386), so it cannot help detect issues specific to either.

malloc errors may happen due to C++ runtime conflicts (GCC uses its own libstdc++, while the OS has its own older library). That is a familiar issue, but I think we handle it pretty well for R-related stuff; at least I do not normally see malloc issues. Another reason may be allocating something beyond the 32-bit address space (but that would be a bug; it should not be happening).

However, malloc aside, a bus error with wrong alignment rather suggests that something is wrong with the code. The standard reasons are wrong endianness, assumed 64-bitness, and one obscure one: the wrong size of bool/spinlock (both are 4 bytes in the Darwin ppc ABI). If none of these is likely to apply, then perhaps more debugging is needed. I do not know the code here: do we know what is supposed to be executed around the time of failure?

krivit commented 1 week ago

Based on the traceback, this appears to be an error in lpSolveAPI::set.objfn(). I don't think we can do anything about it. Perhaps we can capture the problematic inputs, confirm that the error is reproducible, and report a bug to the lpSolveAPI developers?

krivit commented 1 week ago

Something like this could be used to capture the inputs, I think:

trace(ergm:::mple.existence, quote(save(list=ls(), file="existence_dump.rda")))

barracuda156 commented 1 week ago

@krivit Thank you, this is helpful. I will look into lpSolveAPI.

barracuda156 commented 1 week ago

@krivit I guess this was my fault: going by the code comment in lpSolveAPI's lp_types.h, I had added this patch:

--- inst/include/lp_types.h 2023-11-28 22:11:24
+++ inst/include/lp_types.h 2023-12-07 12:33:22
@@ -74,7 +74,11 @@
   #define CHAR_BIT  8
 #endif
 #ifndef MYBOOL
-  #define MYBOOL  unsigned char    /* Conserve memory, could be unsigned int */
+  #if defined(__APPLE__) && defined(__ppc__)
+    #define MYBOOL  unsigned int     /* Darwin ppc 32-bit ABI */
+  #else
+    #define MYBOOL  unsigned char    /* Conserve memory, could be unsigned int */
+  #endif
 #endif

However, the comment is apparently inaccurate, and int does not work. Dropping the patch fixes the problem with running the broom tests; everything passes cleanly now:

--->  Testing R-broom
Executing:  cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_R_R-broom/R-broom/work/broom" && /opt/local/bin/R CMD check ./broom_1.0.6.tar.gz --no-manual --no-build-vignettes 
* using log directory ‘/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_R_R-broom/R-broom/work/broom/broom.Rcheck’
* using R version 4.4.1 (2024-06-14)
* using platform: powerpc-apple-darwin10.0.0d2 (32-bit)
* R was compiled by
    gcc-mp-13 (MacPorts gcc13 13.2.0_4+stdlib_flag) 13.2.0
    GNU Fortran (MacPorts gcc13 13.2.0_4+stdlib_flag) 13.2.0
* running under: OS X Snow Leopard 10.6
* using session charset: UTF-8
* using options ‘--no-manual --no-build-vignettes’
* checking for file ‘broom/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘broom’ version ‘1.0.6’
* package encoding: UTF-8
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘broom’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking ‘build’ directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking code files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking whether startup messages can be suppressed ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking R/sysdata.rda ... OK
* checking installed files from ‘inst/doc’ ... OK
* checking files in ‘vignettes’ ... OK
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘spelling.R’
  Running ‘test-all.R’
 OK
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes ... OK
* checking running R code from vignettes ...
  ‘adding-tidiers.Rmd’ using ‘UTF-8’... OK
  ‘available-methods.Rmd’ using ‘UTF-8’... OK
  ‘bootstrapping.Rmd’ using ‘UTF-8’... OK
  ‘broom.Rmd’ using ‘UTF-8’... OK
  ‘broom_and_dplyr.Rmd’ using ‘UTF-8’... OK
  ‘kmeans.Rmd’ using ‘UTF-8’... OK
 OK
* checking re-building of vignette outputs ... SKIPPED
* DONE

Status: OK

Thank you very much for pointing to lpSolveAPI!