Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Failed to execute: /opt/Augustus/bin//augustus ExonModel: Couldn't open file *_exon_probs.pbl #852

Open kwiyounghan opened 3 weeks ago

kwiyounghan commented 3 weeks ago

Hi BRAKER developers,

I've been using BRAKER3 with RNA-seq data produced by VARUS, on our university HPC using singularity. The first step, GeneMark-ET, runs fine, but then I get an error when training AUGUSTUS.

ERROR in file /opt/BRAKER/scripts/braker.pl at line 7971
Failed to execute: /opt/Augustus/bin//augustus --species=gadusMorhua --AUGUSTUS_CONFIG_PATH=/gxfs_home/geomar/smomw426/.augustus --extrinsicCfgFile=/opt/BRAKER/scripts/cfg/rnaseq.cfg --alternatives-from-evidence=true --hintsfile=/gxfs_work/geomar/smomw426/cod_ref/05_braker/braker/hintsfile.gff --UTR=off --exonnames=on --codingseq=on --allow_hinted_splicesites=gcag,atac --softmasking=1 /gxfs_work/geomar/smomw426/cod_ref/05_braker/braker/genome.fa 1>/gxfs_work/geomar/smomw426/cod_ref/05_braker/braker/augustus.hints.gff 2>/gxfs_work/geomar/smomw426/cod_ref/05_braker/braker/errors/augustus.hints.stderr!

Then, in the augustus.hints.stderr file, the error message reads:

/opt/Augustus/bin//augustus: ERROR
    ExonModel: Couldn't open file /gxfs_home/geomar/smomw426/.augustus/species/gadusMorhua/gadusMorhua_exon_probs.pbl

Here, of course, AUGUSTUS cannot access the species folder, because for some reason the path it uses changes to /gxfs_**home**/geomar/smomw426/.... instead of /gxfs_**work**/geomar/smomw426/...., which I bind in the singularity command and where all the output and temporary files are saved. My command looks like this:

singularity exec --bind /gxfs_work/geomar/smomw426/ $BRAKER_SIF braker.pl --species=gadusMorhua --genome=$GENOME \
       --bam=$RNA_bam --useexisting --geneMarkGtf ${BRAKER_OLD}/GeneMark*/genemark.gtf
#or without the --geneMarkGtf option
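
One way to narrow this down is to check, from inside the container, whether the parameter file named in the error is readable under a given config directory. The following is only a minimal sketch: the helper name `check_species_file` and the script name `check.sh` are hypothetical, and the fallback to `$HOME/.augustus` is an assumption based on the path shown in the error message.

```shell
#!/bin/sh
# Hypothetical helper (not part of BRAKER or AUGUSTUS): report whether the
# exon parameter file named in the error is readable under a config directory.
check_species_file() {
    config_dir=$1   # candidate AUGUSTUS_CONFIG_PATH
    species=$2      # species name, e.g. gadusMorhua
    pbl="$config_dir/species/$species/${species}_exon_probs.pbl"
    if [ -r "$pbl" ]; then
        echo "readable: $pbl"
    else
        echo "missing:  $pbl"
        return 1
    fi
}

# Check the directory AUGUSTUS would use. The fallback to $HOME/.augustus is an
# assumption taken from the path in the error message above.
check_species_file "${AUGUSTUS_CONFIG_PATH:-$HOME/.augustus}" gadusMorhua || true
```

Saved as, say, `check.sh` in a bound directory, it could be run with the same `--bind` options as the real job (`singularity exec --bind /gxfs_work/geomar/smomw426/ $BRAKER_SIF sh check.sh`), so that it sees the same filesystem as braker.pl does.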

How can I control this behavior, or is there a workaround for this issue?

Thanks for your support, Kwi

LarsGab commented 2 weeks ago

Hello Kwi,

Thank you for using BRAKER.

I recommend that you create a large FASTA file with protein sequences of related species before running BRAKER, for example the protein sequences of your target species' clade from OrthoDB. Afterward, run BRAKER with both protein and RNA-seq data. If you run BRAKER with RNA-seq data only, you are executing the BRAKER1 protocol, which is significantly less accurate.

Regarding your issue, it seems BRAKER might be looking in the wrong directory, possibly because the AUGUSTUS_CONFIG_PATH variable is set incorrectly. I suggest running BRAKER with the --AUGUSTUS_CONFIG_PATH option set to the AUGUSTUS config directory that contains your species directory. I think in your case it should be --AUGUSTUS_CONFIG_PATH /gxfs_work/geomar/smomw426/.augustus/config/. Please confirm this path is correct.

Afterward, try running BRAKER without the --geneMarkGtf and --useexisting options.
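
Put together, the suggestions above might look like the sketch below. It is not a verified command line: the config path is the unconfirmed one proposed above, and `PROTEINS` and the default file names (`braker3.sif`, `genome.fa`, `rnaseq.bam`, `proteins.fa`) are placeholder assumptions. The script only prints the command unless `RUN=1` is set, so it can be inspected before anything is launched.

```shell
#!/bin/sh
# Placeholder values; replace them with your own paths.
WORK_DIR=${WORK_DIR:-/gxfs_work/geomar/smomw426}
# Path suggested in this thread; confirm that it exists and is writable.
CONFIG_DIR=${CONFIG_DIR:-$WORK_DIR/.augustus/config}
BRAKER_SIF=${BRAKER_SIF:-braker3.sif}
GENOME=${GENOME:-genome.fa}
RNA_bam=${RNA_bam:-rnaseq.bam}
PROTEINS=${PROTEINS:-proteins.fa}   # hypothetical OrthoDB clade protein FASTA

# Build the command as positional parameters so that quoting is preserved.
# --useexisting and --geneMarkGtf are deliberately left out.
set -- singularity exec --bind "$WORK_DIR" "$BRAKER_SIF" braker.pl \
    --species=gadusMorhua \
    --genome="$GENOME" \
    --bam="$RNA_bam" \
    --prot_seq="$PROTEINS" \
    --AUGUSTUS_CONFIG_PATH="$CONFIG_DIR"

# Dry run by default: print the command. Set RUN=1 to execute it.
if [ "${RUN:-0}" = 1 ]; then
    "$@"
else
    echo "$@"
fi
```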

Best, Lars

kwiyounghan commented 1 week ago

Hi Lars,

Thanks a lot for the suggestion. I added the protein sequences accordingly and am now running in ETP mode.

As for the issue, I realized that the AUGUSTUS executable was looking in the $HOME directory, not $WORK. I do not know how to control this when installing through singularity, so I circumvented the problem by binding the $HOME directory together with $WORK when executing BRAKER through singularity. It seems to work.
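
For anyone hitting the same error, the workaround can be sketched roughly as follows. This is an illustration rather than the exact command used: `WORK_DIR`, `HOME_DIR`, `PROTEINS` and the default file names are placeholder assumptions, and the two directories are passed to `--bind` as one comma-separated list.

```shell
#!/bin/sh
# Placeholder values; the fallbacks follow the paths quoted in this thread.
HOME_DIR=${HOME:-/gxfs_home/geomar/smomw426}
WORK_DIR=${WORK:-/gxfs_work/geomar/smomw426}
BRAKER_SIF=${BRAKER_SIF:-braker3.sif}
GENOME=${GENOME:-genome.fa}
RNA_bam=${RNA_bam:-rnaseq.bam}
PROTEINS=${PROTEINS:-proteins.fa}   # hypothetical protein FASTA for ETP mode

# Bind both directories so that the ~/.augustus config directory and the
# working directory are visible inside the container.
BIND_SPEC="$HOME_DIR,$WORK_DIR"

set -- singularity exec --bind "$BIND_SPEC" "$BRAKER_SIF" braker.pl \
    --species=gadusMorhua \
    --genome="$GENOME" \
    --bam="$RNA_bam" \
    --prot_seq="$PROTEINS"

# Dry run by default: print the command. Set RUN=1 to execute it.
if [ "${RUN:-0}" = 1 ]; then
    "$@"
else
    echo "$@"
fi
```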

Thanks again for your support. Kwi