Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Cannot create temporary file and failed to execute gmes_petap.pl #724

Closed by dmacguigan 9 months ago

dmacguigan commented 9 months ago

Hello,

I am running BRAKER 3.0.6 on an HPC cluster using the Singularity image file, with protein evidence only. Here is how I am calling BRAKER:

BRAKER_SIF="/projects/academic/tkrabben/software/BRAKER3/braker3.0.6.sif" # location of BRAKER singularity image
SPECIES="Mmel"
AUGUSTUS_SPECIES_NAME="Minytrema_melanops"
ANNOTATION_DIR="/projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation"
GENOME_DIR="Mmel_assembly"
MASKED_GENOME_FILE="Mmel_10kbp20kbp_flye_Medaka_purgehap_YaHS_Chromosome_TGS_gapCloser.masked.fasta"
PROT_FASTA="/projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/OrthoDB11/OrthoDBVert.NCBI.proteins.fasta"
BRAKER_THREADS=20

singularity run -H ${PWD} ${BRAKER_SIF} braker.pl --genome=${GENOME_DIR}/${MASKED_GENOME_FILE} --prot_seq=${PROT_FASTA} --species=${AUGUSTUS_SPECIES_NAME} --workingdir=${PWD}/${SPECIES}_BRAKER --GENEMARK_PATH=${ETP}/gmes --threads ${BRAKER_THREADS} --gff3

However, I encountered the following error message. The directory /scratch/14487641 does not exist on my system.

...
[Thu Dec 14 20:35:51 2023] Finished spliced alignment
[Thu Dec 14 20:35:53 2023] Flagging top chains
[Thu Dec 14 20:37:03 2023] Processing the output
[Thu Dec 14 20:40:03 2023] Output processed
[Thu Dec 14 20:40:04 2023] ProtHint finished.
sort: cannot create temporary file in '/scratch/14487641': No such file or directory
ERROR in file /opt/BRAKER/scripts/braker.pl at line 5302
Failed to execute: /usr/bin/perl /opt/ETP/bin/gmes/gmes_petap.pl --verbose --seq /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genome.fa --EP /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genemark_hintsfile.gff --cores=20  --gc_donor 0.001 --evidence /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genemark_evidence.gff  --soft_mask auto 1>/projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/GeneMark-EP.stdout 2>/projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/errors/GeneMark-EP.stderr
Failed to execute: /usr/bin/perl /opt/ETP/bin/gmes/gmes_petap.pl --verbose --seq /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genome.fa --EP /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genemark_hintsfile.gff --cores=20  --gc_donor 0.001 --evidence /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genemark_evidence.gff  --soft_mask auto 1>/projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/GeneMark-EP.stdout 2>/projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/errors/GeneMark-EP.stderr !

All of the BRAKER error files are empty:

dmacguig@vortex-future:~/project/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/errors$ ls -l
total 0
-rw-rw-r-- 1 dmacguig dmacguig 0 Dec 14 11:03 find_python3_biopython.err
-rw-rw-r-- 1 dmacguig dmacguig 0 Dec 14 11:03 find_python3_re.err
-rw-rw-r-- 1 dmacguig dmacguig 0 Dec 14 11:03 gc_content.stderr
-rw-rw-r-- 1 dmacguig dmacguig 0 Dec 14 20:40 GeneMark-EP.stderr
-rw-rw-r-- 1 dmacguig dmacguig 0 Dec 14 11:08 GeneMark-ES.stderr
-rw-rw-r-- 1 dmacguig dmacguig 0 Dec 14 11:05 new_species.stderr

But GeneMark-EP.stdout says the following:

dmacguig@vortex-future:/projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER$ cat GeneMark-EP.stdout
# check before the run
# hard_mask is in the 'auto' mode. hard_mask was set to: 100
# creat directories
# commit input data
error, output file is empty data/ep.gff
error on call: /opt/ETP/bin/gmes/reformat_gff.pl --out data/ep.gff  --trace info/dna.trace  --in /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genemark_hintsfile.gff  --quiet

Any idea what might be happening? Happy to provide more info if needed.

Thanks for your help!

dmacguigan commented 9 months ago

Also, here is the tail end of GeneMark-ES.stdout:

dmacguig@vortex-future:~/project/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER$ tail -n 20 GeneMark-ES.stdout
44, -6.93762609911524
45, -6.97399374328611
46, -7.70488125182891
47, -7.01173407126896
48, -7.325391630124
49, -7.63077327967518
51, -7.17878815593213
52, -7.78492395950244
55, -7.70488125182891
56, -7.9672455162964
57, -7.63077327967518
59, -7.9672455162964
# estimate 1: 0.911845177903901
# estimate 2: 0.900180380875777
# value: 0.900180380875777
IvsT  0.900180380875777 vs 0.0998196191242225
# predict final gene set
running gm.hmm on local multi-core system
3858 contigs in list
3858 contigs in list

And the tail of braker.log:

dmacguig@vortex-future:~/project/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER$ tail braker.log 
# WARNING: 
# The hints file(s) for GeneMark-EX contain less than 150 introns with multiplicity >= 4! (In total, 0 unique introns are contained. 0 have a multiplicity >= 4.)
# Possibly, you are trying to run braker.pl on data that does not provide sufficient multiplicity information. This will e.g. happen if you try to use introns generated from assembled RNA-Seq transcripts; or if you try to run braker.pl in epmode with mappings from proteins without sufficient hits per locus. Or if you use the example data set.
# A low number of intron hints with sufficient multiplicity may result in a crash of GeneMark-EX (it should not crash with the example data set).
#*********
# Thu Dec 14 20:40:07 2023: Running GeneMark-EP
# Thu Dec 14 20:40:07 2023: changing into GeneMark-EP directory /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/GeneMark-EP
cd /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/GeneMark-EP
# Thu Dec 14 20:40:07 2023: Running gmes_petap.pl
/usr/bin/perl /opt/ETP/bin/gmes/gmes_petap.pl --verbose --seq /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genome.fa --EP /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genemark_hintsfile.gff --cores=20  --gc_donor 0.001 --evidence /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/genemark_evidence.gff  --soft_mask auto 1>/projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/GeneMark-EP.stdout 2>/projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER/errors/GeneMark-EP.stderr

dmacguigan commented 9 months ago

I seem to have found a solution. Based on this post, I added the following line to my script before calling the BRAKER Singularity image file.

export TMPDIR=/tmp/

BRAKER now finishes without error.
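
For anyone hitting the same error, the guard below is a minimal sketch of the fix above, not part of BRAKER itself. The `sort` call failed because `TMPDIR` pointed at a directory (`/scratch/14487641`) that the process could not see; one likely cause, which is an assumption here, is that the scheduler sets a job-specific `TMPDIR` on the host and Singularity passes that variable into the container without the directory being available there. The helper name `tmpdir_usable` and the variable `BRAKER_TMPDIR` are hypothetical and only serve the illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BRAKER): succeeds only if the given
# path is a non-empty string naming an existing, writable directory.
tmpdir_usable() {
    [ -n "$1" ] && [ -d "$1" ] && [ -w "$1" ]
}

# Use /tmp, as in the fix above, and report early if it is not usable.
BRAKER_TMPDIR="/tmp"
if tmpdir_usable "$BRAKER_TMPDIR"; then
    export TMPDIR="$BRAKER_TMPDIR"
    echo "Using TMPDIR=${TMPDIR}"
else
    echo "ERROR: ${BRAKER_TMPDIR} is not a writable directory" >&2
fi

# The same check can be repeated inside the container before a long run.
# It needs Singularity and the image, so it is left commented out here:
# singularity exec "${BRAKER_SIF}" sh -c 'test -d "$TMPDIR" && test -w "$TMPDIR"' \
#     || echo "TMPDIR is not usable inside the container" >&2
```

Checking inside the container matters because a directory that exists on the host is not necessarily visible to the processes BRAKER starts from the image.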

dmacguig@vortex-future:~/project/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test$ tail -n 20 braker.log
Deleting file /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/prevHints.gff
Deleting file /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/proteins.fa
Deleting file /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/seed_proteins.faa
Deleting file /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/uniqueSeeds.gtf
Deleting file /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/braker.gtf_temp
Deleting file /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/genemark_hintsfile.gff
Deleting file /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/gc_content.out
Deleting file /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/evidence.gff
Deleting file /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/genemark_evidence.gff
Deleting directory /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/GeneMark-ES/data
Deleting directory /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/GeneMark-ES/info
Deleting directory /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/GeneMark-ES/output
Deleting directory /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/GeneMark-ES/run
Deleting directory /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/GeneMark-EP/data
Deleting directory /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/GeneMark-EP/info
Deleting directory /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/GeneMark-EP/output
Deleting directory /projects/academic/tkrabben/MacGuigan/genome_annotations/Mmel_annotation/Mmel_BRAKER_test/GeneMark-EP/run
#**********************************************************************************
#                               BRAKER RUN FINISHED                                
#**********************************************************************************

KatharinaHoff commented 9 months ago

Thank you for reporting this fix. I will add it to the FAQ.