Closed: ashleyacevedo closed this issue 7 months ago
Hi @ashleyacevedo. Thanks for using MuSE. MuSE sump uses OpenMP for parallel computing. Could you please tell me which OS the cloud machine runs and which gcc version you used to compile MuSE 2.0.2? I am happy to help.
Thanks for the quick reply @jiyunmaths! Here is my Dockerfile:
FROM ubuntu:20.04
ENV SAMTOOLS_VERSION=1.14
# Set working directory
WORKDIR /
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
bzip2 \
gcc \
g++ \
libbz2-dev \
libcurl3-dev \
libncurses5-dev \
liblzma-dev \
make \
cmake \
autoconf \
libtool \
wget \
zlib1g-dev \
libssl-dev \
git \
ca-certificates cpp libltdl-dev unzip
# Pulling SAMTools from its repository, unpacking the archive and installing
ADD https://github.com/samtools/samtools/releases/download/${SAMTOOLS_VERSION}/samtools-${SAMTOOLS_VERSION}.tar.bz2 .
RUN tar xjvf samtools-${SAMTOOLS_VERSION}.tar.bz2 \
&& rm samtools-${SAMTOOLS_VERSION}.tar.bz2 \
&& cd samtools-${SAMTOOLS_VERSION} \
&& ./configure \
&& make \
&& make install \
&& cd ..
# Download the MuSE 2.0.2 source archive and build it
ADD https://github.com/wwylab/MuSE/archive/refs/tags/v2.0.2.tar.gz .
RUN tar -xvf v2.0.2.tar.gz \
&& rm v2.0.2.tar.gz \
&& cd MuSE-2.0.2 \
&& ./install_muse.sh
# Move MuSE to PATH so that it can be run as a command
ENV PATH=/MuSE-2.0.2:$PATH
The gcc installed on this image is 9.4.0.
gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.4.0-1ubuntu1~20.04.2' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-9QDOt0/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.2)
Hi @ashleyacevedo, I fixed a bug in the V2.0.2 Makefile that disabled OpenMP for parallel computing in the MuSE sump step. The new release is V2.0.3. Please use this latest version in your Dockerfile. Thank you very much.
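One way to check whether a given MuSE build picked up OpenMP is to look for an OpenMP runtime among the shared libraries the binary links against. The sketch below is an illustration under assumptions, not part of MuSE: the function names `links_openmp` and `check_binary` are hypothetical, and it assumes a Linux image with `ldd` and a dynamically linked runtime (a statically linked OpenMP runtime would not show up in `ldd` output).

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the ldd output read from stdin
# lists a common OpenMP runtime (GNU libgomp, LLVM libomp, Intel libiomp5).
links_openmp() {
    grep -Eq 'lib(gomp|omp|iomp5)\.so'
}

# Hypothetical helper: run ldd on the binary given as $1 and test the
# result. Assumes the binary is dynamically linked.
check_binary() {
    ldd "$1" 2>/dev/null | links_openmp
}

# Example usage (not executed here):
#   check_binary "$(command -v MuSE)" && echo "OpenMP runtime linked"
```

If the check fails for a binary built from V2.0.2 and succeeds for one built from V2.0.3, that would be consistent with the Makefile fix described above.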
Thank you @jiyunmaths for this update. I am working with @ashleyacevedo. What is your release process for uploading new versions to conda (https://anaconda.org/jiyunmaths/muse), or do you recommend installing from source?
@Ben-Habermeyer Sorry, I do not have a plan to release a new conda version at the moment. Please install it from the source code. Thanks.
Thank you so much @jiyunmaths 🙏
The issue is fixed.
I'm running MuSE sump as follows:
MuSE sump -G -I calls.MuSE.txt -O muse.vcf -n 72 -D known_sites.vcf.gz
The runtime is >15 hours for a MuSE.txt file of ~6 GB, and there appears to be no multi-processing. See example logging of machine usage:
I'm running MuSE version 2.0.2 on a cloud machine with 140 GB RAM and 72 CPUs. No stdout or stderr is produced.
Do you have any thoughts on what I might be doing wrong?
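A quick way to confirm whether a running process is actually using multiple threads on Linux is to read its thread count from `/proc`. This is a minimal sketch assuming a Linux host; `thread_count` is a hypothetical helper name, and the `pgrep` call in the usage comment assumes that tool is installed.

```shell
#!/bin/sh
# Hypothetical helper: print the number of threads of the process whose
# PID is given as $1, taken from the "Threads:" field of
# /proc/<pid>/status (Linux only).
thread_count() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Example usage (not executed here):
#   thread_count "$(pgrep -n MuSE)"
```

A value of 1 while `MuSE sump -n 72` is running would be consistent with a build in which OpenMP was not enabled.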