nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

PASA fails on test run #778

Closed: dmacguigan closed this issue 2 years ago

dmacguigan commented 2 years ago

Are you using the latest release? Yes

Describe the bug

I am using the latest Docker image for funannotate (run with Singularity) on a university computing cluster. PASA fails with the error messages below. I have tried both the test dataset and my own data, and the issue persists.

From funannotate-train.log:

CMD ERROR: /venv/opt/pasa-2.4.1/Launch_PASA_pipeline.pl -c /user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/pasa/alignAssembly.txt -r -C -R -g /user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/genome.fasta --IMPORT_CUSTOM_ALIGNMENTS /user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/trinity.alignments.gff3 -T -t /user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/trinity.long-reads.fasta.clean -u /user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/trinity.long-reads.fasta --stringent_alignment_overlap 30.0 --TRANSDECODER --ALT_SPLICE --MAX_INTRON_LENGTH 3000 --CPU 4 --ALIGNERS blat --trans_gtf /user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/funannotate_train.stringtie.gtf

From pasa-assembly.log:

* [Sat Sep  3 00:34:42 2022] Running CMD: /venv/opt/pasa-2.4.1/scripts/assemble_clusters.dbi -G /user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/genome.fasta  -M '/user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/pasa/Awesome_rna_pasa'  -T 4  > Awesome_rna_pasa.pasa_alignment_assembly_building.ascii_illustrations.out
Thread 2 terminated abnormally: Can't open file /scratch/9342753/pasa.1662179682-0.152672105775313.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 3433.
Thread 1 terminated abnormally: Can't open file /scratch/9342753/pasa.1662179683-0.287152186109783.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 7222.
Thread 5 terminated abnormally: Can't open file /scratch/9342753/pasa.1662179683-0.225862968786718.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 5624.
Thread 4 terminated abnormally: Can't open file /scratch/9342753/pasa.1662179683-0.0356843572381003.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 7203.
Thread 3 terminated abnormally: Can't open file /scratch/9342753/pasa.1662179683-0.0467865295513583.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 14155.
ERROR, thread 1 exited with error Can't open file /scratch/9342753/pasa.1662179683-0.287152186109783.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 7222.

ERROR, thread 2 exited with error Can't open file /scratch/9342753/pasa.1662179682-0.152672105775313.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 3433.

ERROR, thread 3 exited with error Can't open file /scratch/9342753/pasa.1662179683-0.0467865295513583.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 14155.

ERROR, thread 4 exited with error Can't open file /scratch/9342753/pasa.1662179683-0.0356843572381003.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 7203.

ERROR, thread 5 exited with error Can't open file /scratch/9342753/pasa.1662179683-0.225862968786718.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 5624.

Thread 6 terminated abnormally: Can't open file /scratch/9342753/pasa.1662179683-0.611531752468185.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 9579.
ERROR, thread 6 exited with error Can't open file /scratch/9342753/pasa.1662179683-0.611531752468185.+.in at /venv/opt/pasa-2.4.1/PerlLib/CDNA/PASA_alignment_assembler.pm line 232, <$fh> line 9579.

Error, 6 threads failed.
Error, cmd: /venv/opt/pasa-2.4.1/scripts/assemble_clusters.dbi -G /user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/genome.fasta  -M '/user/dmacguig/project/MacGuigan/funannotate_test/test-rna_seq_6223445e-e558-4795-b5e8-e99bbd2c734e/rna-seq/training/pasa/Awesome_rna_pasa'  -T 4  > Awesome_rna_pasa.pasa_alignment_assembly_building.ascii_illustrations.out died with ret 6400 No such file or directory at /venv/opt/pasa-2.4.1/PerlLib/Pipeliner.pm line 187.
        Pipeliner::run(Pipeliner=HASH(0x563050e310f8)) called at /venv/opt/pasa-2.4.1/Launch_PASA_pipeline.pl line 1044

What command did you issue?

singularity run -H ${PWD} /projects/academic/tkrabben/software/funannotate_1.8.13/funannotate_latest.sif funannotate test -t rna-seq --cpus 6

Logfiles

See attached files: funannotate-train.log, pasa-assembly.log

OS/Install Information

-------------------------------------------------------
Checking dependencies for 1.8.13
-------------------------------------------------------
You are running Python v 3.8.12. Now checking python packages...
biopython: 1.79
goatools: 1.2.3
matplotlib: 3.5.3
natsort: 8.1.0
numpy: 1.22.4
pandas: 1.4.3
psutil: 5.9.1
requests: 2.28.1
scikit-learn: 1.1.1
scipy: 1.5.3
seaborn: 0.11.2
All 11 python packages installed

You are running Perl v b'5.026002'. Now checking perl modules...
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.39
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
local::lib: 2.000024
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/opt/databases
$PASAHOME=/venv/opt/pasa-2.4.1
$TRINITYHOME=/venv/opt/trinity-2.8.5
$EVM_HOME=/venv/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/usr/share/augustus/config
        ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir
-------------------------------------------------------
Checking external dependencies...
PASA: 2.4.1
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.3.2
bamtools: bamtools 2.5.2
bedtools: bedtools v2.30.0
blat: BLAT v35
diamond: 2.0.15
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2017-11-15
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.8-internal
kallisto: 0.46.1
mafft: v7.505 (2022/Apr/10)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.24-r1122
pigz: pigz 2.6
proteinortho: 6.0.16

pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.12
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.9 (July 2021)
tantan: tantan 39
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
        ERROR: emapper.py not installed
        ERROR: gmes_petap.pl not installed
        ERROR: signalp not installed

hyphaltip commented 2 years ago

And can you write to /scratch/$JOBID (/scratch/9342753/ in this example)?

Did you set the scratch folder in the SINGULARITY_BIND environment variable before starting the process? For example, on our system we need to bind the partitions we want to write to.

# in this example the $SCRATCH variable is set in the slurm env to a local disk folder that is deleted after the job completes
export SINGULARITY_BIND="${SCRATCH}:/tmp"
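
A quick way to answer both questions is to test the directory on the host and then again from inside the container. The following is a minimal sketch, not part of funannotate or PASA: the helper name dir_is_writable and the default image path are hypothetical, and it assumes the job's scratch directory is the one SLURM exported in TMPDIR. Binding the directory at the same path inside the container is one alternative to the bind onto /tmp shown above.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if the directory exists and is writable.
dir_is_writable() {
    local dir="$1"
    [ -n "$dir" ] && [ -d "$dir" ] && [ -w "$dir" ]
}

# Hypothetical image path; replace with your own .sif file.
SIF="${SIF:-funannotate_latest.sif}"

# On the host: is the scratch directory that SLURM put in TMPDIR usable?
if dir_is_writable "${TMPDIR:-}"; then
    echo "host: ${TMPDIR} is writable"
    # Bind it at the same path inside the container so temp file paths resolve.
    export SINGULARITY_BIND="${TMPDIR}:${TMPDIR}"
else
    echo "host: TMPDIR (${TMPDIR:-unset}) is missing or not writable"
fi

# Inside the container: repeat the check (skipped if singularity or the image is absent).
if command -v singularity >/dev/null 2>&1 && [ -f "$SIF" ]; then
    singularity exec "$SIF" sh -c 'test -d "$TMPDIR" && test -w "$TMPDIR"' \
        && echo "container: TMPDIR is writable" \
        || echo "container: TMPDIR is missing or not writable"
fi
```

If the host check passes but the container check fails, the directory is most likely not bound into the container, which would match the "Can't open file /scratch/..." errors in pasa-assembly.log.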

dmacguigan commented 2 years ago

Ah, it seems my problem stems from how SLURM sets environment variables.

From https://ubccr.freshdesk.com/support/solutions/articles/13000065629-singularity-slurm-error-mktemp-failed-to-create-file-via-template:

Users submitting job scripts with sbatch that run Singularity containers may notice this error:

mktemp: failed to create file via template ‘/scratch/[jobid]/tmp.XXXXXXXXXX’: No such file or directory

With sbatch, the environment variable TMPDIR gets set to be /scratch/[jobid], so when mktemp tries to execute, it looks for /scratch/[jobid], but the singularity container may not have a /scratch directory.

Recommended solution:

At the beginning of the job script, unset TMPDIR to clear that variable, which allows the Singularity container to use whatever directory is the default for it.
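
As a rough sketch of that recommendation applied to this issue, the body of the job script might look like the following. The image path and funannotate command are the ones reported earlier in this thread; the #SBATCH directives are omitted because they are site-specific, and the guard around the singularity call is an addition so the script degrades gracefully where Singularity or the image is unavailable.

```shell
#!/bin/bash
# Sketch of the body of an sbatch job script (no #SBATCH directives shown).

# sbatch exports TMPDIR=/scratch/[jobid]; that path may not exist inside the
# container, so clear it and let the container fall back to its default.
unset TMPDIR
echo "TMPDIR: ${TMPDIR:-<unset>}"

# Image path and command as reported earlier in this issue.
SIF=/projects/academic/tkrabben/software/funannotate_1.8.13/funannotate_latest.sif

if command -v singularity >/dev/null 2>&1 && [ -f "$SIF" ]; then
    singularity run -H "${PWD}" "$SIF" funannotate test -t rna-seq --cpus 6
else
    echo "singularity or ${SIF} not found; skipping run"
fi
```

Unsetting TMPDIR is the simpler of the two approaches in this thread; binding the scratch directory into the container instead keeps temporary files on the job's local disk, which may matter for large runs.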