Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

GUSHR fails at line 281 on stranded RNAseq BRAKER command #396

Status: Closed (ckeeling closed this issue 1 year ago)

ckeeling commented 3 years ago

Hello,

I'm using the following command with braker 2.1.6:

braker.pl --genome genome.fa --UTR=on --stranded=+,- --bam=merged_fwd.bam,merged_rev.bam --softmasking --cores 48

I get the following error:

ERROR in file /usr/local/bin/BRAKER/scripts/braker.pl at line 10347
Failed not execute /usr/bin/python3 /usr/local/bin/GUSHR/gushr.py -b /project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam /project/6015718/ckeeling/BRAKER2/RNAseq/merged_rev.bam -t /project/6015718/ckeeling/BRAKER2/RNAseq/braker/augustus.hints.gtf -g /project/6015718/ckeeling/BRAKER2/RNAseq/braker/genome.fa -o /project/6015718/ckeeling/BRAKER2/RNAseq/braker/gushr -c 48 -s /usr/bin -a /usr/local/bin/Augustus/scripts -j /usr/bin -q 2 > /project/6015718/ckeeling/BRAKER2/RNAseq/braker/gushr.log 2> /project/6015718/ckeeling/BRAKER2/RNAseq/braker/errors/gushr.err!

gushr.err is empty, but gushr.log shows:

Searching for samtools:
Will use /usr/bin/samtools
Searching for gtf2gff.pl:
Will use /usr/local/bin/Augustus/scripts/gtf2gff.pl
Trying to execute the following command:
/usr/bin/samtools sort -@ 48 /project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam -o ./gushr-KHGQHBHHASFN/rnaseq_0_s.bam
Suceeded in executing command.
Error in file /usr/local/bin/GUSHR/gushr.py at line 281: Return code of subprocess was 1['/usr/bin/samtools', 'sort', '-@', '48', '/project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam', '-o', './gushr-KHGQHBHHASFN/rnaseq_0_s.bam']
Will try again...
Trying to execute the following command:
/usr/bin/samtools sort -@ 48 /project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam -o ./gushr-KHGQHBHHASFN/rnaseq_0_s.bam
Suceeded in executing command.
Error in file /usr/local/bin/GUSHR/gushr.py at line 281: Return code of subprocess was 1['/usr/bin/samtools', 'sort', '-@', '48', '/project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam', '-o', './gushr-KHGQHBHHASFN/rnaseq_0_s.bam']
Will try again...
Trying to execute the following command:
/usr/bin/samtools sort -@ 48 /project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam -o ./gushr-KHGQHBHHASFN/rnaseq_0_s.bam
Suceeded in executing command.
Error in file /usr/local/bin/GUSHR/gushr.py at line 281: Return code of subprocess was 1['/usr/bin/samtools', 'sort', '-@', '48', '/project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam', '-o', './gushr-KHGQHBHHASFN/rnaseq_0_s.bam']
Will try again...
Trying to execute the following command:
/usr/bin/samtools sort -@ 48 /project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam -o ./gushr-KHGQHBHHASFN/rnaseq_0_s.bam
Suceeded in executing command.
Error in file /usr/local/bin/GUSHR/gushr.py at line 281: Return code of subprocess was 1['/usr/bin/samtools', 'sort', '-@', '48', '/project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam', '-o', './gushr-KHGQHBHHASFN/rnaseq_0_s.bam']

If I run the samtools command myself (without all the single quotes and commas) there is no error.

Any suggestions on a solution? Thanks, Chris
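For anyone repeating the manual check described above, it can help to capture the exit status and stderr of the command explicitly, since gushr.log reports only the return code. The following is a minimal sketch; the helper name run_checked and the stderr file name are illustrative assumptions, not part of BRAKER or GUSHR.

```shell
#!/bin/sh
# run_checked (hypothetical helper): run a command, save its stderr to a file,
# and report the exit status so a failure is not silently lost.
run_checked() {
    err_file="$1"   # file that receives the command's stderr
    shift
    "$@" 2> "$err_file"
    status=$?
    echo "exit status: $status"
    if [ "$status" -ne 0 ]; then
        echo "stderr was saved to $err_file:"
        cat "$err_file"
    fi
    return "$status"
}

# Example with the command taken from gushr.log. The -o path is relative,
# so the command should be run from the same working directory that GUSHR used:
# run_checked sort.err /usr/bin/samtools sort -@ 48 /project/6015718/ckeeling/BRAKER2/RNAseq/merged_fwd.bam -o ./gushr-KHGQHBHHASFN/rnaseq_0_s.bam
```

Running the check inside the same container and working directory as the failing job is probably the most informative comparison, because a command that succeeds on the host may still fail where the output location is not writable.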

ckeeling commented 3 years ago

This seems to be a problem with GeMoMa (GeMoMa-1.6.2.jar) that is described elsewhere: https://github.com/Jstacs/Jstacs/issues/12 and https://github.com/Gaius-Augustus/GUSHR/issues/2. I am running BRAKER in a Singularity container, and GeMoMa cannot write its GeMoMa.ini.xml file inside the read-only container. To work around this, I installed GUSHR outside the container and pointed BRAKER to that external copy. I passed an environment file to the Singularity command as follows:

singularity exec --cleanenv --env-file config_file ....

Where the config_file contains: GUSHR_PATH=/path to GUSHR outside of container/GUSHR

The job is still running, at the GUSHR step, but GUSHR IS running.

This is a kludge to get it to work. Hopefully it can be resolved in another way within BRAKER/GUSHR in the future.
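The workaround above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name write_gushr_env is hypothetical, and the GUSHR directory is assumed to be an existing clone on the host that is also visible at the same path inside the container (Singularity's --bind option may be needed for that, depending on the site configuration).

```shell
#!/bin/sh
# write_gushr_env (hypothetical helper): write the environment file that is
# later passed to `singularity exec --env-file`.
# Assumption: $1 is a host directory containing a GUSHR clone, e.g. created with
#   git clone https://github.com/Gaius-Augustus/GUSHR.git /some/host/path/GUSHR
write_gushr_env() {
    gushr_dir="$1"   # GUSHR installation outside the container
    env_file="$2"    # environment file to create, e.g. config_file

    # The point of the workaround is a writable location, so fail early otherwise.
    if [ ! -d "$gushr_dir" ] || [ ! -w "$gushr_dir" ]; then
        echo "GUSHR directory missing or not writable: $gushr_dir" >&2
        return 1
    fi
    printf 'GUSHR_PATH=%s\n' "$gushr_dir" > "$env_file"
}

# Usage (paths are placeholders):
# write_gushr_env "$HOME/GUSHR" config_file
```

The writability check is there because the failure was traced to a file that could not be written inside the read-only image; it does not verify that GUSHR itself is complete or functional.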

sanyalab commented 2 years ago

Hi Chris,

I want to use the Braker2 container you built from Singularity Hub. I have two questions (I am just getting my hands dirty with Singularity): Do you pull the latest Braker2 release when you build the container? Does the container get updated whenever there is a new release of Braker2?

Thanks Abhijit

ckeeling commented 2 years ago

Hello @sanyalab,

When I build the container, it pulls the most recent versions of most things. However, the container on Singularity Hub is stored already built, so it is frozen at the time I built it (currently Oct 7, 2021 at 7:44 pm). Below is the recipe I used for the build. If you'd like to build it yourself with any modifications, it is not too hard to do on https://cloud.sylabs.io/builder once you create your own account there. See the notes in the recipe on registration for the GeneMark tools. This container was built for my own use, so it may not work on all systems.

Bootstrap: docker
From: ubuntu:latest

%labels
    MAINTAINER christopher.keeling@XXXXX.ca
    Version 20210622

%help

This is a container used to run braker2 
MAINTAINER christopher.keeling@XXXXX.ca
(Not associated with braker2 code development)

For more information, see: 
https://github.com/Gaius-Augustus/BRAKER

Citation:
Tomas Bruna, Katharina J. Hoff, Alexandre Lomsadze, Mario Stanke and Mark Borodovsky. 2021. "BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database." NAR Genomics and Bioinformatics 3(1):lqaa108.

%runscript
    echo "This is a Singularity container used to run braker2:"

    echo "Container was created $NOW"

%post

TZ=America/Montreal
LANG='en_US.UTF-8'
LANGUAGE='en_US:en'
LC_ALL='en_US.UTF-8'

ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
apt-get -y update
apt-get install -y gnupg
gpg --list-keys

gpg --keyserver keyserver.ubuntu.com --recv-key E298A3A825C0D65DFD57CBB651716619E084DAB9
gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | apt-key add -
apt-get -y upgrade
apt-get install -y python3 python3-dev python3-setuptools git wget build-essential autoconf
apt-get install -y python3-pip python3-tk libx11-dev openjdk-8-jdk sudo

# Install dependencies for AUGUSTUS comparative gene prediction mode (CGP)
apt-get install -y libgsl-dev libboost-all-dev libsuitesparse-dev liblpsolve55-dev
apt-get install -y libsqlite3-dev libmysql++-dev

# Install dependencies for the optional support of gzip compressed input files
apt-get install -y libboost-iostreams-dev zlib1g-dev

# Install dependencies for bam2hints and filterBam
apt-get install -y libbamtools-dev

# Install additional dependencies for bam2wig
apt-get install -y samtools libhts-dev

# Install additional dependencies for homGeneMapping and utrrnaseq
apt-get install -y libboost-all-dev

# Install additional dependencies for scripts
apt-get install -y cdbfasta diamond-aligner libfile-which-perl libparallel-forkmanager-perl libyaml-perl libdbd-mysql-perl
apt-get install -y --no-install-recommends python3-biopython

apt-get install -y language-pack-en-base locales 
locale-gen en_US.UTF-8

pip3 install biopython

# Install CPAN-Perl modules
apt-get install -y cpanminus

sudo cpanm File::Spec::Functions File::HomeDir Hash::Merge List::Util MCE::Mutex Logger::Simple Module::Load::Conditional Parallel::ForkManager POSIX Scalar::Util::Numeric YAML Math::Utils threads

# Install dependencies
    cd /usr/local/bin

    apt-get install -y bamtools samtools spaln exonerate diamond-aligner cdbfasta

# GeneMark-ES/ET/EP
    ## Note that these links are only temporary ("ozhiW" part) due to registration. 
    ## You need to register and replace the links here.

    wget http://topaz.gatech.edu/GeneMark/tmp/GMtool_ozhiW/gmes_linux_64.tar.gz
    tar zxvf gmes_linux_64.tar.gz
    rm gmes_linux_64.tar.gz
    cd gmes_linux_64

    # Software key
    wget http://topaz.gatech.edu/GeneMark/tmp/GMtool_ozhiW/gm_key_64.gz
    gunzip gm_key_64.gz

    ## key must be in the home directory under a simpler name (outside of the Singularity container)
    ## cp /usr/local/bin/gmes_linux_64/gm_key_64 ~/.gm_key # To be done once container is created

    ## Change path to perl in scripts
    ./change_path_in_perl_scripts.pl /usr/bin/perl

    cd ..

# AUGUSTUS
    git clone https://github.com/Gaius-Augustus/Augustus.git
    cd Augustus
    make augustus
    make auxprogs
    cd auxprogs
    make
    cd /usr/local/bin/Augustus/auxprogs/filterBam
    make
    cd /usr/local/bin/Augustus/auxprogs/utrrnaseq
    make
    cd /usr/local/bin/Augustus/auxprogs/joingenes
    make
    cd /usr/local/bin/Augustus
    sudo make install

    cd /usr/local/bin

# ProtHint
    git clone https://github.com/gatech-genemark/ProtHint.git

# GenomeThreader
    wget https://genomethreader.org/distributions/gth-1.7.3-Linux_x86_64-64bit.tar.gz
    tar zxvf gth-1.7.3-Linux_x86_64-64bit.tar.gz
    rm gth-1.7.3-Linux_x86_64-64bit.tar.gz

# NCBI BLAST+
    wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.11.0+-x64-linux.tar.gz
    tar zxvf ncbi-blast-2.11.0+-x64-linux.tar.gz
    rm ncbi-blast-2.11.0+-x64-linux.tar.gz

# cdbfasta
    git clone https://github.com/gpertea/cdbfasta.git
    cd cdbfasta
    make
    cd ..

# GUSHR
    git clone https://github.com/Gaius-Augustus/GUSHR.git

ln -s /usr/bin/python3 /usr/bin/python

# Install Braker2
echo "Installing Braker2..."
    git clone https://github.com/Gaius-Augustus/BRAKER.git

%environment
    export PATH="/usr/local/bin:/usr/local/sbin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/bin/ncbi-blast-2.11.0+/bin:/usr/local/bin/gmes_linux_64:/usr/local/bin/BRAKER/scripts:/usr/local/bin/ProtHint/bin:/usr/local/bin/gth-1.7.3-Linux_x86_64-64bit/bin:/usr/local/bin/cdbfasta"
    export AUGUSTUS_BIN_PATH="/opt/augustus-3.4.0/bin"
    export AUGUSTUS_SCRIPTS_PATH="/usr/local/bin/Augustus/scripts"
    export BSSMDIR="/usr/local/bin/gth-1.7.3-Linux_x86_64-64bit/bin/bssm"
    export GTHDATADIR="/usr/local/bin/gth-1.7.3-Linux_x86_64-64bit/bin/gthdata"
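    # Note: %environment is sourced each time the container starts, so NOW holds the launch time, not the build time
    export NOW=`date`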
    export SINGULARITYENV_LD_LIBRARY_PATH="/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH"

I ran it as follows:

singularity exec --cleanenv --env-file config_file braker2_latest.sif braker.pl --genome genome.fa --prot_seq proteins.fa --softmasking --cores 48

Where config_file contains:

AUGUSTUS_BIN_PATH=/opt/augustus-3.4.0/bin
AUGUSTUS_CONFIG_PATH=/path_to_your_working_directory_outside_container/config
AUGUSTUS_SCRIPTS_PATH=/usr/local/bin/Augustus/scripts
GUSHR_PATH=/usr/local/bin/GUSHR
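
As a rough sketch, the config_file above could be generated for a given working directory like this. The function name write_braker_env is hypothetical, and the check for a species subdirectory assumes that a copy of the AUGUSTUS config directory has already been placed in the working directory outside the container (how it gets there, for example by copying it out of the image, is left to the reader).

```shell
#!/bin/sh
# write_braker_env (hypothetical helper): write the four variables shown above,
# substituting the working directory into AUGUSTUS_CONFIG_PATH.
write_braker_env() {
    workdir="$1"    # working directory outside the container
    env_file="$2"   # environment file to create, e.g. config_file

    cat > "$env_file" <<EOF
AUGUSTUS_BIN_PATH=/opt/augustus-3.4.0/bin
AUGUSTUS_CONFIG_PATH=$workdir/config
AUGUSTUS_SCRIPTS_PATH=/usr/local/bin/Augustus/scripts
GUSHR_PATH=/usr/local/bin/GUSHR
EOF

    # Assumption: a usable AUGUSTUS config directory contains a species/ folder.
    if [ ! -d "$workdir/config/species" ]; then
        echo "warning: $workdir/config/species not found; copy the AUGUSTUS config directory there first" >&2
    fi
}

# Usage (run from the working directory):
# write_braker_env "$PWD" config_file
```

The warning is only a convenience: BRAKER writes species parameters under AUGUSTUS_CONFIG_PATH, which is presumably why that path points outside the read-only container.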