veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Segmentation faults running tests on FreeBSD #1555

Closed by Jehops 1 year ago

Jehops commented 1 year ago

Hello,

When running the tests on FreeBSD, there are usually segmentation faults, sometimes in the FEL and/or CONTRAST-FEL tests, sometimes in other tests. Here are two example test runs.

root@13amd64-default:/usr/ports/biology/hyphy # uname -a
FreeBSD 13amd64-default 13.1-RELEASE-p2 FreeBSD 13.1-RELEASE-p2 amd64
root@13amd64-default:/usr/ports/biology/hyphy # make test
===>  Testing for hyphy-2.5.46
===>   hyphy-2.5.46 depends on executable: bash - found
Set default compiler flags to -fsigned-char -O3  -msse4.1
/usr/local/lib/libcurl.so
Node not installed; API documentation will not be generated
-- Could NOT find MPI_C (missing: MPI_C_LIB_NAMES MPI_C_HEADER_DIR MPI_C_WORKS)
-- Could NOT find MPI_CXX (missing: MPI_CXX_LIB_NAMES MPI_CXX_HEADER_DIR MPI_CXX_WORKS)
-- Could NOT find MPI (missing: MPI_C_FOUND MPI_CXX_FOUND)
-- Configuring done
-- Generating done
-- Build files have been written to: /wrkdirs/usr/ports/biology/hyphy/work/hyphy-2.5.46
ninja: no work to do.
[  0% 1/1] cd /wrkdirs/usr/ports/biology/hyphy/work/hyphy-2.5.46 && /usr/local/bin/ctest --force-new-ctest-process
Test project /wrkdirs/usr/ports/biology/hyphy/work/hyphy-2.5.46
      Start  1: UNIT-TESTS
 1/20 Test  #1: UNIT-TESTS .......................   Passed    1.00 sec
      Start  2: CODON
 2/20 Test  #2: CODON ............................   Passed    0.63 sec
      Start  3: PROTEIN
 3/20 Test  #3: PROTEIN ..........................   Passed    5.99 sec
      Start  4: MTCODON
 4/20 Test  #4: MTCODON ..........................   Passed   20.60 sec
      Start  5: ALGAE
 5/20 Test  #5: ALGAE ............................   Passed    7.22 sec
      Start  6: CILIATES
 6/20 Test  #6: CILIATES .........................   Passed   10.85 sec
      Start  7: SLAC
 7/20 Test  #7: SLAC .............................   Passed    2.88 sec
      Start  8: SLAC-PARTITIONED
 8/20 Test  #8: SLAC-PARTITIONED .................***Exception: SegFault  2.48 sec
      Start  9: FEL
 9/20 Test  #9: FEL ..............................   Passed   18.95 sec
      Start 10: MEME
10/20 Test #10: MEME .............................   Passed   48.01 sec
      Start 11: MEME-PARTITIONED
11/20 Test #11: MEME-PARTITIONED .................   Passed   34.18 sec
      Start 12: BUSTED
12/20 Test #12: BUSTED ...........................   Passed   12.59 sec
      Start 13: BUSTED-SRV
13/20 Test #13: BUSTED-SRV .......................   Passed   20.76 sec
      Start 14: RELAX
14/20 Test #14: RELAX ............................   Passed   34.63 sec
      Start 15: FUBAR
15/20 Test #15: FUBAR ............................***Exception: SegFault  2.26 sec
      Start 16: BGM
16/20 Test #16: BGM ..............................   Passed    3.00 sec
      Start 17: CONTRAST-FEL
17/20 Test #17: CONTRAST-FEL .....................***Exception: SegFault 15.59 sec
      Start 18: GARD
18/20 Test #18: GARD .............................   Passed    5.13 sec
      Start 19: FADE
19/20 Test #19: FADE .............................   Passed   28.41 sec
      Start 20: ABSREL
20/20 Test #20: ABSREL ...........................   Passed   31.54 sec

85% tests passed, 3 tests failed out of 20

Total Test time (real) = 306.73 sec

The following tests FAILED:
          8 - SLAC-PARTITIONED (SEGFAULT)
         15 - FUBAR (SEGFAULT)
         17 - CONTRAST-FEL (SEGFAULT)
root@14amd64-default:/usr/ports/biology/hyphy # uname -a
FreeBSD 14amd64-default 14.0-CURRENT FreeBSD 14.0-CURRENT 1400074 amd64
root@14amd64-default:/usr/ports/biology/hyphy # make test
===>  Testing for hyphy-2.5.46
===>   hyphy-2.5.46 depends on executable: bash - found
Set default compiler flags to -fsigned-char -O3  -march=native -mtune=native -mavx2 -mfma
/usr/local/lib/libcurl.so
Node not installed; API documentation will not be generated
-- Could NOT find MPI_C (missing: MPI_C_LIB_NAMES MPI_C_HEADER_DIR MPI_C_WORKS)
-- Could NOT find MPI_CXX (missing: MPI_CXX_LIB_NAMES MPI_CXX_HEADER_DIR MPI_CXX_WORKS)
-- Could NOT find MPI (missing: MPI_C_FOUND MPI_CXX_FOUND)
-- Configuring done
-- Generating done
CMake Warning:
  Manually-specified variables were not used by the project:

    BUILD_TESTING

-- Build files have been written to: /wrkdirs/usr/ports/biology/hyphy/work/hyphy-2.5.46
ninja: no work to do.
[  0% 1/1] cd /wrkdirs/usr/ports/biology/hyphy/work/hyphy-2.5.46 && /usr/local/bin/ctest --force-new-ctest-process
Test project /wrkdirs/usr/ports/biology/hyphy/work/hyphy-2.5.46
      Start  1: UNIT-TESTS
 1/20 Test  #1: UNIT-TESTS .......................   Passed    1.08 sec
      Start  2: CODON
 2/20 Test  #2: CODON ............................   Passed    0.31 sec
      Start  3: PROTEIN
 3/20 Test  #3: PROTEIN ..........................   Passed    3.33 sec
      Start  4: MTCODON
 4/20 Test  #4: MTCODON ..........................   Passed    9.06 sec
      Start  5: ALGAE
 5/20 Test  #5: ALGAE ............................   Passed    3.80 sec
      Start  6: CILIATES
 6/20 Test  #6: CILIATES .........................   Passed    5.28 sec
      Start  7: SLAC
 7/20 Test  #7: SLAC .............................   Passed    2.72 sec
      Start  8: SLAC-PARTITIONED
 8/20 Test  #8: SLAC-PARTITIONED .................   Passed    9.29 sec
      Start  9: FEL
 9/20 Test  #9: FEL ..............................   Passed    9.30 sec
      Start 10: MEME
10/20 Test #10: MEME .............................   Passed   28.34 sec
      Start 11: MEME-PARTITIONED
11/20 Test #11: MEME-PARTITIONED .................   Passed   22.78 sec
      Start 12: BUSTED
12/20 Test #12: BUSTED ...........................   Passed    9.18 sec
      Start 13: BUSTED-SRV
13/20 Test #13: BUSTED-SRV .......................   Passed   16.10 sec
      Start 14: RELAX
14/20 Test #14: RELAX ............................   Passed   22.60 sec
      Start 15: FUBAR
15/20 Test #15: FUBAR ............................   Passed    2.87 sec
      Start 16: BGM
16/20 Test #16: BGM ..............................   Passed    2.49 sec
      Start 17: CONTRAST-FEL
17/20 Test #17: CONTRAST-FEL .....................***Exception: SegFault  3.78 sec
      Start 18: GARD
18/20 Test #18: GARD .............................   Passed    5.22 sec
      Start 19: FADE
19/20 Test #19: FADE .............................   Passed   26.91 sec
      Start 20: ABSREL
20/20 Test #20: ABSREL ...........................   Passed   16.25 sec

95% tests passed, 1 tests failed out of 20

Total Test time (real) = 200.69 sec

The following tests FAILED:
         17 - CONTRAST-FEL (SEGFAULT)
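
Since the set of failing tests differs between runs, tallying failures across several saved ctest logs can show which tests are affected most often. The following is only a minimal sketch, assuming the ctest result-line format shown in the logs above; the names `RESULT_LINE`, `failed_tests`, and `tally_failures` are hypothetical helpers and not part of HyPhy or CTest.

```python
import re
from collections import Counter

# Assumed format of a ctest result line, taken from the logs above, e.g.
# " 8/20 Test  #8: SLAC-PARTITIONED .................***Exception: SegFault  2.48 sec"
RESULT_LINE = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#\d+:\s+(\S+)\s+\.+\s*(.*?)\s+[\d.]+\s+sec\s*$"
)


def failed_tests(log_text):
    """Return the names of tests whose status is anything other than 'Passed'."""
    failures = []
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line)
        if match and match.group(2) != "Passed":
            failures.append(match.group(1))
    return failures


def tally_failures(logs):
    """Count how often each test failed across several ctest logs."""
    counts = Counter()
    for log in logs:
        counts.update(failed_tests(log))
    return counts


if __name__ == "__main__":
    # Excerpts from the two runs reported above.
    run_13 = """
 7/20 Test  #7: SLAC .............................   Passed    2.88 sec
 8/20 Test  #8: SLAC-PARTITIONED .................***Exception: SegFault  2.48 sec
15/20 Test #15: FUBAR ............................***Exception: SegFault  2.26 sec
17/20 Test #17: CONTRAST-FEL .....................***Exception: SegFault 15.59 sec
"""
    run_14 = """
 8/20 Test  #8: SLAC-PARTITIONED .................   Passed    9.29 sec
17/20 Test #17: CONTRAST-FEL .....................***Exception: SegFault  3.78 sec
"""
    for name, count in tally_failures([run_13, run_14]).items():
        print(f"{name}: failed in {count} run(s)")
```

For the two runs above, CONTRAST-FEL is the only test that fails in both.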

spond commented 1 year ago

Dear @Jehops,

Thanks so much for reporting. I made some significant code changes to improve computational performance, and this allowed some silly bugs to creep in. These were all "read-faults", so other systems (OS X and CentOS, where I do most of the dev and testing) did not catch them.

Moving forward, I'll make sure to run the entire test suite with the address sanitizer turned on before pushing out new releases.

In the meantime, I patched up and re-released 2.5.46.

Best, Sergei

Jehops commented 1 year ago

Thank you kindly @spond! The tests do indeed look good with the 2.5.46hf1 tag.

One nit-picky request: our ports/package system converts tags like 2.5.46hf1 to OS package versions like 2.5.46.h1, and those are interpreted as something of a pre-release of 2.5.46.

% pkg version -t hyphy-2.5.46.h1 hyphy-2.5.46
<

This means that if a user had hyphy-2.5.46 installed, pkg upgrade would not pull in hyphy-2.5.46.h1. The issue doesn't apply here, because our package is still at hyphy-2.5.43, and I'll upgrade directly to hyphy-2.5.46.h1. If there are no constraints on your side, perhaps a tag like 2.5.46.1 would work in these situations.
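
To illustrate why a trailing letter component can sort below the plain release while a trailing numeric component sorts above it, here is a toy model in Python. It is not pkg's actual implementation. It assumes each dot-separated component is read as (number, letter, patch level), that a component with no leading digits gets the number -1, and that the shorter version is padded with (0, 0, 0). The names `parse_component` and `compare_versions` are hypothetical.

```python
import re
from itertools import zip_longest

# Assumed component shape: optional digits, optional letters, optional digits.
_COMPONENT = re.compile(r"(\d*)([A-Za-z]*)(\d*)")


def parse_component(text):
    """Split one dot-separated component into (number, letter, patch level)."""
    match = _COMPONENT.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported version component: {text!r}")
    number, letters, patch = match.groups()
    n = int(number) if number else -1  # assumption: no leading digits sorts below 0
    a = ord(letters[0].lower()) - ord("a") + 1 if letters else 0
    pl = int(patch) if patch else 0
    return (n, a, pl)


def compare_versions(left, right):
    """Return '<', '=' or '>' for two dotted version strings under the toy rules."""
    left_parts = map(parse_component, left.split("."))
    right_parts = map(parse_component, right.split("."))
    # Assumption: a missing component compares as (0, 0, 0).
    for a, b in zip_longest(left_parts, right_parts, fillvalue=(0, 0, 0)):
        if a != b:
            return "<" if a < b else ">"
    return "="


if __name__ == "__main__":
    print(compare_versions("2.5.46.h1", "2.5.46"))  # prints "<"
    print(compare_versions("2.5.46.1", "2.5.46"))   # prints ">"
```

Under these assumptions, "h1" parses to (-1, 8, 1), which is below the padding value (0, 0, 0), whereas "1" parses to (1, 0, 0), which is above it. That matches the `pkg version -t` result shown above and the reason a tag like 2.5.46.1 would be picked up as an upgrade.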

spond commented 1 year ago

Dear @Jehops,

Great suggestion. I'll use continued '.' notation in the future since the name of the tag does not matter to me internally. Thanks again for reporting the memory issue.

Best, Sergei