pirovc / ganon

ganon2 classifies genomic sequences against large sets of references efficiently, with integrated download and update of databases (refseq/genbank), taxonomic profiling (ncbi/gtdb), binning and hierarchical classification, customized reporting and more
https://pirovc.github.io/ganon/
MIT License

Ganon classify: Error code -11 #272

Closed (ericvdtoorn closed 9 months ago)

ericvdtoorn commented 9 months ago

Running ganon classify with a custom DB (HumGut) on a paired sample, I get the following output:

> ganon classify --db-prefix /home/holo/tools/databases/humgut/humgut_ganon/HumGut_gtdb -p data/PRJEB27928/ERR2726507/ERR2726507_1.fastp.depleted.fq.gz data/PRJEB27928/ERR2726507/ERR2726507_2.fastp.depleted.fq.gz -o data/PRJEB27928/ERR2726507/ERR2726507.ganon --threads 16 

- - - - - - - - - -
   _  _  _  _  _
  (_|(_|| |(_)| |
   _|   v. 2.0.0
- - - - - - - - - -
Classifying reads
The following command failed to run:
/projects/PAPERS/benchmark/.snakemake/conda/312d7c098affca9126d9637aa078769c_/bin/ganon-classify  --paired-reads data/PRJEB27928/ERR2726507/ERR2726507_1.fastp.depleted.fq.gz,data/PRJEB27928/ERR2726507/ERR2726507_2.fastp.depleted.fq.gz --ibf /home/holo/tools/databases/humgut/humgut_ganon/HumGut_gtdb.ibf --tax /home/holo/tools/databases/humgut/humgut_ganon/HumGut_gtdb.tax  --rel-cutoff 0.75 --rel-filter 0.1 --fpr-query 1e-05 --output-prefix data/PRJEB27928/ERR2726507/ERR2726507.ganon --skip-lca  --output-all   --threads 16

Error code: -11

Does not occur for all samples, only some. Stats for this example:

> seqkit stats PRJEB27928/ERR2726507/ERR2726507_{1,2}.fastp.depleted.fq.gz
file                                                     format  type   num_seqs      sum_len  min_len  avg_len  max_len
PRJEB27928/ERR2726507/ERR2726507_1.fastp.depleted.fq.gz  FASTQ   DNA   7,889,633  805,955,602       31    102.2      113
PRJEB27928/ERR2726507/ERR2726507_2.fastp.depleted.fq.gz  FASTQ   DNA   7,888,954  821,768,055       16    104.2      113

EDIT: running with verbose did not change much:

- - - - - - - - - -
   _  _  _  _  _
  (_|(_|| |(_)| |
   _|   v. 2.0.0
- - - - - - - - - -
Classifying reads
----------------------------------------------------------------------
--paired-reads
                      data/PRJEB27928/ERR2726507/ERR2726507_1.fastp.depleted.fq.gz
                      data/PRJEB27928/ERR2726507/ERR2726507_2.fastp.depleted.fq.gz
--output-prefix       data/PRJEB27928/ERR2726507/ERR2726507.ganon
--output-lca          0
--output-all          1
--output-unclassified 0
--output-single       0
--hibf                0
--threads             16
--n-batches           1000
--n-reads             400
--skip-lca            1
--verbose             1
--quiet               0
----------------------------------------------------------------------
H1
--rel-filter 0.1
--fpr-query 1e-05
    /home/holo/tools/databases/humgut/humgut_ganon/HumGut_gtdb.ibf, /home/holo/tools/databases/humgut/humgut_ganon/HumGut_gtdb.tax --rel-cutoff 0.75
    Output files: data/PRJEB27928/ERR2726507/ERR2726507.ganon.rep, data/PRJEB27928/ERR2726507/ERR2726507.ganon.all
----------------------------------------------------------------------
The following command failed to run:
/projects/PAPERS/benchmark/.snakemake/conda/312d7c098affca9126d9637aa078769c_/bin/ganon-classify  --paired-reads data/PRJEB27928/ERR2726507/ERR2726507_1.fastp.depleted.fq.gz,data/PRJEB27928/ERR2726507/ERR2726507_2.fastp.depleted.fq.gz --ibf /home/holo/tools/databases/humgut/humgut_ganon/HumGut_gtdb.ibf --tax /home/holo/tools/databases/humgut/humgut_ganon/HumGut_gtdb.tax  --rel-cutoff 0.75 --rel-filter 0.1 --fpr-query 1e-05 --output-prefix data/PRJEB27928/ERR2726507/ERR2726507.ganon --skip-lca  --output-all   --threads 16   --verbose

Error code: -11

pirovc commented 9 months ago

From the stats you sent, it looks like your paired files do not have the same number of sequences (7,889,633 vs. 7,888,954), and that is probably the cause of the error. I downloaded the original ERR2726507 sequences and they worked just fine here, so the preprocessing step you are using is probably the issue.
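
A quick way to confirm this before running ganon is to walk both files in parallel and compare record counts and read IDs. The following is a minimal sketch, not part of ganon: the helper names `read_ids` and `check_pairs` are hypothetical, and it assumes standard 4-line FASTQ records (optionally gzipped) whose mate IDs match after stripping a trailing `/1` or `/2`.

```python
import gzip
import itertools
from typing import Iterator, Optional, Tuple


def read_ids(path: str) -> Iterator[str]:
    """Yield the read ID of each record in a 4-line-per-record FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Every 4th line, starting at the first, is a header line.
        for header in itertools.islice(handle, 0, None, 4):
            fields = header[1:].split()  # drop '@' and any description
            read_id = fields[0] if fields else ""
            # Assumption: mates are distinguished only by a /1 or /2 suffix.
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id


def check_pairs(r1_path: str, r2_path: str) -> Tuple[int, int, Optional[int]]:
    """Return (records in R1, records in R2, index of first mismatch or None)."""
    n1 = n2 = 0
    first_mismatch = None
    pairs = itertools.zip_longest(read_ids(r1_path), read_ids(r2_path))
    for index, (id1, id2) in enumerate(pairs):
        n1 += id1 is not None  # None marks a file that ran out of records
        n2 += id2 is not None
        if first_mismatch is None and id1 != id2:
            first_mismatch = index
    return n1, n2, first_mismatch
```

Calling `check_pairs("sample_1.fq.gz", "sample_2.fq.gz")` (placeholder file names) returns the two record counts and the zero-based index of the first record where the mates disagree, or `None` if the files are properly paired. Unequal counts or a non-`None` index would point to the preprocessing step dropping reads from only one mate file.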

ericvdtoorn commented 9 months ago

Alright, thanks for the quick help. I'll see whether that fixes the problem on my side, and reopen the issue if it remains 👍