sestaton / Transposome

A toolkit for annotation of transposable element families from unassembled sequence reads
http://sestaton.github.io/Transposome
MIT License

Error at the beginning of Transposome #43

Open KristinaGagalova opened 4 years ago

KristinaGagalova commented 4 years ago

Hi Evan,

I have started running Transposome and get the following error:

INFO - ======== Transposome version: 0.12.1 (started at: 16-07-2020 12:13:32) ========
INFO - Configuration - Log file for monitoring progress and errors: t_log.txt
INFO - Configuration - Sequence file:                               wpw_interl.fq.gz
INFO - Configuration - Sequence format:                             fastq
INFO - Configuration - Sequence number for each BLAST process:      50000
INFO - Configuration - Number of CPUs per thread:                   1
INFO - Configuration - Number of threads:                           24
INFO - Configuration - Output directory:                            transposome_results_out
INFO - Configuration - In-memory analysis:                          1
INFO - Configuration - Percent identity for matches:                90
INFO - Configuration - Fraction coverage for pairwise matches:      0.55
INFO - Configuration - Merge threshold for clusters:                0.001
INFO - Configuration - Minimum cluster size for annotation:         100
INFO - Configuration - BLAST e-value threshold for annotation:      10
INFO - Configuration - Repeat database for annotation:              wpw.purged-70.tigmint.arks.sealer.fa.mod.EDTA.TElibRepBase.fa
INFO - Configuration - Log file for clustering/merging results:     t_cluster_report.txt
INFO - Transposome::Run::Blast::run_allvall_blast started at:   16-07-2020 12:13:33.
Error in tempfile() using template /projects/btl_scratch/kgagalova/Annotation/White_pine_weevil/GenomeArticle/Repeats/Wpw/Transposome/WpwReads/transposome_results_out/wpw_interl.fq_8187_XXXX.fasta: Could not create temp file /projects/btl_scratch/kgagalova/Annotation/White_pine_weevil/GenomeArticle/Repeats/Wpw/Transposome/WpwReads/transposome_results_out/wpw_interl.fq_8187_Pgdi.fasta: Too many open files at /home/kgagalova/perl5/lib/perl5/Transposome/Run/Blast.pm line 428.
"transposome-bl" unexpectedly returned exit value 24 at /home/kgagalova/perl5/bin/transposome line 142.

Is that because the file is too large? Should I subsample it? Also, I specified fastq as the file format. Is that an accepted format, or do I need to convert it to fasta? Thank you in advance.

sestaton commented 4 years ago

Hi Kristina,

The format is fine and gzipped is also okay. The 'too many open files' error is a system error: the process hit the operating system's limit on open file handles. In this case it may be because the program is creating too many subsets for the BLAST processes. I've never seen this with Transposome, but I'm guessing there are tens or hundreds of millions of sequences in the input file?
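As a rough illustration of that guess, the sketch below counts how many subsets an input would be split into at 50,000 sequences per BLAST process (the value shown in the log) and compares that count with the per-process open-file limit. It assumes one open temp file per subset, which is an assumption for illustration and not confirmed Transposome behavior. The function names are hypothetical, and the `resource` module is available on Unix-like systems only.

```python
import math
import resource  # standard library, Unix-only


def blast_subset_count(n_sequences, seqs_per_subset=50000):
    """Number of chunks when n_sequences are split into fixed-size subsets."""
    return math.ceil(n_sequences / seqs_per_subset)


def exceeds_open_file_limit(n_sequences, seqs_per_subset=50000):
    """True if the subset count is larger than the soft open-file limit.

    Assumption: one open temp file per subset (not confirmed for Transposome).
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return False
    return blast_subset_count(n_sequences, seqs_per_subset) > soft
```

For example, `blast_subset_count(100_000_000)` gives 2000 subsets, which is above a soft limit of 1024 if that is what the system uses. The same limit can be read in a shell with `ulimit -n`.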

This program only requires very low coverage, so I would subsample to 100k read pairs and try again. You can add more reads if that works, and I recommend taking multiple samples at a given level of coverage to ensure you are not grabbing an odd sample by chance.
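One way to take such a sample is reservoir sampling, which picks a fixed number of read pairs uniformly at random in a single pass without knowing the total count in advance. The sketch below is a minimal example and not part of Transposome. It assumes the input is an interleaved FASTQ file with exactly four lines per read (so eight lines per pair), optionally gzipped. All function names and the output file names are hypothetical.

```python
import gzip
import random


def read_pairs(path):
    """Yield each read pair as a list of 8 lines from an interleaved FASTQ."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            record = [handle.readline() for _ in range(8)]
            if not record[0]:
                break  # clean end of file
            if not record[7]:
                raise ValueError("truncated read pair at end of " + path)
            yield record


def subsample_pairs(pairs, k, seed):
    """Reservoir sampling: keep k pairs chosen uniformly from the stream."""
    rng = random.Random(seed)
    reservoir = []
    for i, pair in enumerate(pairs):
        if i < k:
            reservoir.append(pair)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = pair
    return reservoir


def write_sample(in_path, out_path, k=100_000, seed=1):
    """Write k randomly chosen read pairs to out_path; return pairs written."""
    sample = subsample_pairs(read_pairs(in_path), k, seed)
    with open(out_path, "w") as out:
        for pair in sample:
            out.writelines(pair)
    return len(sample)
```

To draw several independent samples at the same depth, call `write_sample` with a different seed each time, for example `write_sample("wpw_interl.fq.gz", "sample_1.fq", seed=1)` and then again with `seed=2` and `seed=3`. The sampled pairs are held in memory, which should be manageable at 100k pairs.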