Closed DaGaMs closed 3 years ago
Hi Ben, I can't find a commit with ID 3c393a, but this looks like a compiler/library problem to me since always_inline is not used in Octopus code. I'd strongly recommend using the installation command:
$ ./scripts/install.py --dependencies
That way you can be confident you're using recent library and compiler versions, which can bring performance improvements. It's also much easier for me to resolve any installation problems since I can easily replicate.
Sorry, took the wrong end of the shasum. I just meant commit d4a7f4a7fcc56cc29fbe13c4ac926964993c393a
Using the installer isn't an option, unfortunately, because I'm building octopus as a reusable module and the installer puts hard-coded paths for dependencies in place...
I have another one for you - I realised that I compiled on a virtual machine, which meant that the SSE4.2 detection etc. didn't work. I now compile on a cluster node, and I run into this:
Scanning dependencies of target octopus
[ 9%] Building CXX object src/CMakeFiles/octopus.dir/main.cpp.o
In file included from /users/bschuster/sharedscratch/octopus/src/core/models/pairhmm/simd_pair_hmm_factory.hpp:12,
from /users/bschuster/sharedscratch/octopus/src/core/models/pairhmm/pair_hmm.hpp:27,
from /users/bschuster/sharedscratch/octopus/src/core/models/haplotype_likelihood_model.hpp:26,
from /users/bschuster/sharedscratch/octopus/src/core/models/haplotype_likelihood_array.hpp:25,
from /users/bschuster/sharedscratch/octopus/src/core/callers/caller.hpp:25,
from /users/bschuster/sharedscratch/octopus/src/core/callers/caller_factory.hpp:11,
from /users/bschuster/sharedscratch/octopus/src/config/option_collation.hpp:17,
from /users/bschuster/sharedscratch/octopus/src/main.cpp:14:
/users/bschuster/sharedscratch/octopus/src/core/models/pairhmm/avx512_pair_hmm_impl.hpp: In static member function ‘static octopus::hmm::simd::AVX512PairHMMInstructionSet<BandSize, ScoreTp>::VectorType octopus::hmm::simd::AVX512PairHMMInstructionSet<BandSize, ScoreTp>::do_vectorise_zero_set_last(octopus::hmm::simd::AVX512PairHMMInstructionSet<BandSize, ScoreTp>::ScoreType, short int)’:
/users/bschuster/sharedscratch/octopus/src/core/models/pairhmm/avx512_pair_hmm_impl.hpp:216:40: error: there are no arguments to ‘_mm512_set_epi16’ that depend on a template parameter, so a declaration of ‘_mm512_set_epi16’ must be available [-fpermissive]
_mm512_set_epi16(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0));
^~~~~~~~~~~~~~~~
/users/bschuster/sharedscratch/octopus/src/core/models/pairhmm/avx512_pair_hmm_impl.hpp:216:40: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
At global scope:
cc1plus: error: unrecognized command line option ‘-Wno-deprecated-copy’ [-Werror]
cc1plus: all warnings being treated as errors
make[2]: *** [src/CMakeFiles/octopus.dir/main.cpp.o] Error 1
make[1]: *** [src/CMakeFiles/octopus.dir/all] Error 2
make: *** [all] Error 2
Any ideas?
Does your system support AVX-512F and AVX-512BW? What's the output of grep avx /proc/cpuinfo? It looks like your compiler thinks that AVX-512 is available - so the Octopus build tries to include it - but then the relevant headers aren't there. Could be a problem with how your version of GCC was built.
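For anyone hitting the same error, here is a rough sketch for comparing what the CPU reports with what the compiler enables. has_cpu_flag and compiler_avx512_macros are hypothetical helper names, not part of Octopus, and the g++ default and -march=native are assumptions about the build; how the Octopus build itself detects AVX-512 isn't shown in this thread.

```shell
# Hypothetical helper: succeeds if the given CPU flag is listed in a
# cpuinfo-style file (defaults to /proc/cpuinfo).
has_cpu_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # -w matches whole words, so "avx512" alone does not match "avx512f".
    grep -m1 '^flags' "$cpuinfo" | grep -qw -- "$flag"
}

# Hypothetical helper: prints the AVX-512F/BW macros the compiler predefines
# when targeting the local CPU. Assumes a GCC-compatible driver.
compiler_avx512_macros() {
    "${1:-g++}" -march=native -dM -E -x c++ - </dev/null | grep -E '__AVX512(F|BW)__'
}
```

Running has_cpu_flag avx512f && has_cpu_flag avx512bw on the build node, and then compiler_avx512_macros g++, should show whether the CPU and the compiler agree. If both report AVX-512 and the intrinsic is still undeclared, that would point at the compiler's headers rather than at detection.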
Regarding the dynamically linked dependencies, can you not just put the Octopus directory in a shared location and symlink the binary to where it's needed? That's how Homebrew works (package files go to /usr/local/Cellar, then the binaries are symlinked to /usr/local/bin).
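The symlink idea could look roughly like this. link_binary is a hypothetical helper name and the paths in the usage note are placeholders, not locations from this thread.

```shell
# Hypothetical helper: expose an installed binary in another directory
# via a symlink, leaving the installation itself where it was built.
link_binary() {
    target="$1"      # full path to the real binary
    link_dir="$2"    # directory that should expose it, e.g. a module's bin/
    mkdir -p "$link_dir"
    # -s symbolic, -f replace an existing link, -n do not follow an existing link
    ln -sfn "$target" "$link_dir/$(basename "$target")"
}
```

For example, link_binary /shared/apps/octopus/bin/octopus /shared/modules/octopus/bin would leave the dependencies next to the original install and only publish the link.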
yes, AVX-512F and AVX-512BW are definitely available. You might be right that this is an issue with this particular build of gcc-8.3.0. I tried building my own clang and then compiling octopus with that, but I ran into some C library hell at the linking stage 😩
As for why not to use the ./install.py -D: there are a few technical issues, e.g. I have to compile on a machine where I don't have access to the shared directory that the app needs to be installed to in order to be usable as a module. But also, it feels really messy to dump several GB of dependencies into a "publicly" shared location, many of which are already available on the system (boost, gmp, gcc, curl, htslib etc). So I'm not giving up yet trying to find a way to do this "properly"...
Perhaps another avenue to explore is static linking... I've just pushed some commits that improve that. If I build using
$ ./install.py --dependencies --static
on rescomp1, I get :
$ ldd /well/gerton/dan/apps/octopus/bin/octopus
linux-vdso.so.1 => (0x00007fff25183000)
libdl.so.2 => /gpfs3/well/gerton/dan/apps/octopus/build/brew/lib/libdl.so.2 (0x00007fb07f891000)
libm.so.6 => /gpfs3/well/gerton/dan/apps/octopus/build/brew/lib/libm.so.6 (0x00007fb07f790000)
libpthread.so.0 => /gpfs3/well/gerton/dan/apps/octopus/build/brew/lib/libpthread.so.0 (0x00007fb07f770000)
libc.so.6 => /gpfs3/well/gerton/dan/apps/octopus/build/brew/lib/libc.so.6 (0x00007fb07f4d4000)
/gpfs3/well/gerton/dan/apps/octopus/build/brew/lib/ld.so => /lib64/ld-linux-x86-64.so.2 (0x00007fb07f676000)
libmvec.so.1 => /gpfs3/well/gerton/dan/apps/octopus/build/brew/lib/libmvec.so.1 (0x00007fb07f744000)
Eliminating libc seems to be a bit of a pain, so I wonder if it's possible to use the system libraries for these remaining dynamically linked libraries...
But also, it feels really messy to dump several GB of dependencies into a "publicly" shared location, many of which are already available on the system (boost, gmp, gcc, curl, htslib etc).
Guess you're not a fan of Docker then :wink:
Ok, I've just committed (31e163c101f80181547e7add5ca8db4aa1306e58) a change that should enable a fully static binary. Just install using:
$ ./scripts/install.py --dependencies --static
and double check you get
$ ldd bin/octopus
not a dynamic executable
I've only tested on a CentOS 7 machine so far and can't say how the static linking will affect performance...
Whoops, missed something - make that 78ac66b5d9ee21170121e6887c0653e7ec6efb25.
It did compile ok, but when I want to run it, I get Segmentation fault no matter what... 🤨
Hmm, what OS are you using to build? And are you using the binary on the same machine used to build?
This cluster uses some flavour of CentOS 7. /proc/version says
Linux version 3.10.0-1062.1.2.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ) #1 SMP Mon Sep 30 14:19:46 UTC 2019
I compiled successfully, then typed ./octopus, and I get the said error. The only difference I can see here is that you seem to have built and linked with clang and ldd, whereas the install.py script seems to use gcc and ld in my case? Anyway, a quick file on the binary yields:
$ file octopus
octopus: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked (uses shared libs), for GNU/Linux 2.6.32, not stripped
Not sure what "statically linked (uses shared libs)" means - that seems a bit contradictory, but I assumed it was because of the libc issue.
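To compare builds across machines, the linkage part of the file output can be pulled out with a small helper. This is only a sketch: linkage_kind is a hypothetical name, and it merely tells apart the strings quoted in this thread without saying what causes the "(uses shared libs)" variant.

```shell
# Hypothetical helper: classify the linkage reported in a line of `file` output.
linkage_kind() {
    case "$1" in
        # Must be tested first: it also contains "statically linked".
        *"statically linked (uses shared libs)"*) echo "static-with-shared-libs" ;;
        *"statically linked"*) echo "static" ;;
        *"dynamically linked"*) echo "dynamic" ;;
        *) echo "unknown" ;;
    esac
}
```

Running linkage_kind "$(file bin/octopus)" on both the failing and the working build makes the difference easy to spot.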
I'm trying to reproduce this in a Docker container but not getting anywhere:
$ docker run -t -i centos:7 /bin/bash
$ yum -y update
$ yum -y groupinstall 'Development Tools'
$ yum -y install curl file git which perl-devel python3
$ pip3 install distro
$ cd /home
$ git clone https://github.com/luntergroup/octopus.git
$ cd octopus
$ ./scripts/install.py --dependencies --static
$ file bin/octopus
bin/octopus: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.32, not stripped
$ bin/octopus --version
octopus version 0.7.0 (develop 9d066b94)
Target: x86_64 Linux 4.19.76-linuxkit
SIMD extension: AVX2
Compiler: GNU 10.2.0
Boost: 1_73
Everything works as expected...
If you can find a Docker image that reproduces the issue then that would be very helpful. Otherwise I would suggest trying installing again afresh as there have been some changes to the installed dependencies that could have been causing the problem:
$ git pull
$ ./scripts/install.py --dependencies --static --clean
I'm trying to build commit 3c393a on our cluster, using gcc 8.3.0, boost 1.73.0, gmp 6.2.0, htslib 1.10.2, cmake 3.17.3 and python 3.8.1. I get the following error:
Any idea what this might be about? I'm unsure how to even try to debug this. I had no issues compiling 0.6.3-beta in the same way, for the record.