nf-core / mag

Assembly and binning of metagenomes
https://nf-co.re/mag
MIT License

NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF fails when all genomes classified by ANI screening step #598

Closed cedwardson4 closed 1 month ago

cedwardson4 commented 4 months ago

Description of the bug

Process NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF (SPAdes-MaxBin2-prokarya-unrefined-FF07295009) terminated with an error exit status (1).

The failing command is "mv identify/* .", which errors with: "mv: can't rename 'identify/*': No such file or directory"

I believe this happens because every genome in the set was classified by the ANI pre-screening step, so the Identify and Align steps are skipped and their output folders are never created.

[2024-02-29 15:58:13] INFO: 18 genome(s) have been classified using the ANI pre-screening step.
[2024-02-29 15:58:13] INFO: Done.
[2024-02-29 15:58:13] INFO: All genomes have been classified by the ANI screening step, Identify and Align steps will be skipped.
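
The failure mode described above can be reproduced outside the pipeline. This is only a sketch in a scratch directory, not the module's actual script: the directory and file names are placeholders, and it assumes bash's default globbing (no `nullglob`), where an unmatched pattern such as `identify/*` is handed to `mv` as a literal string.

```shell
#!/usr/bin/env bash
# Reproduction sketch: only "classify" exists, as when GTDB-Tk skips
# the Identify and Align steps. All names here are placeholders.
scratch="$(mktemp -d)"
cd "$scratch"
mkdir classify
touch classify/result.tsv

# The first move succeeds because the glob matches a real file.
mv classify/* .

# The second glob matches nothing, so mv receives the literal string
# 'identify/*' and exits with a non-zero status.
status=0
mv identify/* . 2>/dev/null || status=$?
echo "mv identify/* exit status: $status"
```

Since Nextflow task scripts normally stop at the first failing command, a non-zero status from that one `mv` would be enough to fail the whole process even though `gtdbtk classify_wf` itself finished cleanly.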

Command used and terminal output

nextflow run nf-core/mag -r 2.5.4 -profile docker --input '/extra/WildR_TNX/1_INPUT_fastq_deinterleaved/*_R{1,2}.fq.gz' --outdir /extra/WildR_TNX/WildR_TNX_nfc-mag_out_individual_assembly \
    --binning_map_mode own \
    --max_cpus 8 \
    --max_memory 60GB \
    --host_fasta /extra/ref_files/GRCm39/fasta/mm39.fa \
    --centrifuge_db /extra/ref_files/centrifuge/indices/hpvc/ \
    --kraken2_db /extra/ref_files/kraken_db/k2_pluspfp_16gb_20231009/ \
    --gtdb_db /home/christian/miniconda3/envs/gtdbtk-2.3.0/share/gtdbtk-2.3.0/db/ \
    --gtdb_mash /home/christian/gtdbtk_2.3v214_mash \
    --bin_domain_classification \
    --checkm_db /home/christian/checkm_data/ \
    --refine_bins_dastool \
    --binqc_tool checkm \
    --postbinning_input both \
    --run_gunc \
    --gunc_db /extra/ref_files/gunc/gunc_db_progenomes2.1.dmnd \
    --krona_db /home/christian/Krona-2.8.1/KronaTools/taxonomy/taxonomy.tab \
    --skip_concoct \
    --skip_prokka \
    -resume

[2e/287f82] process > NFCORE_MAG:MAG:FASTQC_RAW (FF07295009_run0_raw)                                                [100%] 8 of 8, cached: 8 ✔
[3d/8a90ac] process > NFCORE_MAG:MAG:FASTP (FF07295042_run0)                                                         [100%] 8 of 8, cached: 8 ✔
[4e/7a1008] process > NFCORE_MAG:MAG:BOWTIE2_HOST_REMOVAL_BUILD (mm39.fa)                                            [100%] 1 of 1, cached: 1 ✔
[42/7fcae8] process > NFCORE_MAG:MAG:BOWTIE2_HOST_REMOVAL_ALIGN (FF07295186_run0)                                    [100%] 8 of 8, cached: 8 ✔
[ed/c14b4b] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_BUILD (GCA_002596845.1_ASM259684v1_genomic.fna.gz)         [100%] 1 of 1, cached: 1 ✔
[e7/f206e3] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_ALIGN (FF07295926_run0)                                    [100%] 8 of 8, cached: 8 ✔
[9e/50afa8] process > NFCORE_MAG:MAG:FASTQC_TRIMMED (FF07295926_run0)                                                [100%] 8 of 8, cached: 8 ✔
[-        ] process > NFCORE_MAG:MAG:CAT_FASTQ                                                                       -
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_RAW                                                                    -
[-        ] process > NFCORE_MAG:MAG:PORECHOP                                                                        -
[-        ] process > NFCORE_MAG:MAG:NANOLYSE                                                                        -
[-        ] process > NFCORE_MAG:MAG:FILTLONG                                                                        -
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_FILTERED                                                               -
[cc/dbc5c0] process > NFCORE_MAG:MAG:CENTRIFUGE (FF07295926-hpvc)                                                    [100%] 8 of 8, cached: 8 ✔
[a0/5677db] process > NFCORE_MAG:MAG:KRAKEN2 (FF07295042-k2_pluspfp_16gb_20231009)                                   [100%] 8 of 8, cached: 8 ✔
[d3/8c86fb] process > NFCORE_MAG:MAG:KRONA (kraken2-FF07295186)                                                      [100%] 16 of 16, cached: 16 ✔
[77/87d73d] process > NFCORE_MAG:MAG:MEGAHIT (FF07295042)                                                            [100%] 8 of 8, cached: 8 ✔
[-        ] process > NFCORE_MAG:MAG:POOL_LONG_READS                                                                 -
[e8/7ef2fa] process > NFCORE_MAG:MAG:SPADES (FF07295042)                                                             [100%] 8 of 8, cached: 8 ✔
[-        ] process > NFCORE_MAG:MAG:SPADESHYBRID                                                                    -
[e8/4117ef] process > NFCORE_MAG:MAG:QUAST (SPAdes-FF07295173)                                                       [100%] 16 of 16, cached: 16 ✔
[47/d913c2] process > NFCORE_MAG:MAG:PRODIGAL (FF07295009)                                                           [100%] 16 of 16, cached: 16 ✔
[5a/8061c2] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_BUILD (SPAdes-FF07295926)                  [100%] 16 of 16, cached: 16 ✔
[ed/caec40] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_ALIGN (SPAdes-FF07295926-FF07295926)       [100%] 16 of 16, cached: 16 ✔
[72/dc0a2b] process > NFCORE_MAG:MAG:BINNING:METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS (FF07295042)                       [100%] 16 of 16, cached: 16 ✔
[26/e5c9c3] process > NFCORE_MAG:MAG:BINNING:CONVERT_DEPTHS (FF07295926)                                             [100%] 16 of 16, cached: 16 ✔
[69/632a03] process > NFCORE_MAG:MAG:BINNING:METABAT2_METABAT2 (FF07294962)                                          [100%] 16 of 16, cached: 16 ✔
[6f/91ad39] process > NFCORE_MAG:MAG:BINNING:MAXBIN2 (FF07295926)                                                    [100%] 16 of 16, cached: 16 ✔
[80/24d469] process > NFCORE_MAG:MAG:BINNING:ADJUST_MAXBIN2_EXT (SPAdes-FF07295926)                                  [100%] 16 of 16, cached: 16 ✔
[57/fa0099] process > NFCORE_MAG:MAG:BINNING:SPLIT_FASTA (SPAdes-MaxBin2-FF07295926)                                 [100%] 32 of 32, cached: 32 ✔
[50/ee5802] process > NFCORE_MAG:MAG:BINNING:GUNZIP_BINS (SPAdes-MaxBin2-FF07295926.060.fa.gz)                       [100%] 2093 of 2093, cached: 2093 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:GUNZIP_UNBINS                                                           -
[98/936720] process > NFCORE_MAG:MAG:DOMAIN_CLASSIFICATION:TIARA:TIARA_TIARA (FF07295009)                            [100%] 16 of 16, cached: 16 ✔
[78/b0d665] process > NFCORE_MAG:MAG:DOMAIN_CLASSIFICATION:TIARA:DASTOOL_FASTATOCONTIG2BIN_TIARA (FF07294962)        [100%] 32 of 32, cached: 32 ✔
[3d/389384] process > NFCORE_MAG:MAG:DOMAIN_CLASSIFICATION:TIARA:TIARA_CLASSIFY (FF07295009)                         [100%] 32 of 32, cached: 32 ✔
[19/298a21] process > NFCORE_MAG:MAG:DOMAIN_CLASSIFICATION:TIARA:TIARA_SUMMARY                                       [100%] 1 of 1, cached: 1 ✔
[e5/c8ad96] process > NFCORE_MAG:MAG:BINNING_REFINEMENT:RENAME_PREDASTOOL (SPAdes-MaxBin2-FF07295042)                [100%] 32 of 32, cached: 32 ✔
[d0/364b9e] process > NFCORE_MAG:MAG:BINNING_REFINEMENT:DASTOOL_FASTATOCONTIG2BIN_METABAT2 (FF07295173)              [100%] 16 of 16, cached: 16 ✔
[6c/63b63d] process > NFCORE_MAG:MAG:BINNING_REFINEMENT:DASTOOL_FASTATOCONTIG2BIN_MAXBIN2 (FF07295042)               [100%] 16 of 16, cached: 16 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING_REFINEMENT:DASTOOL_FASTATOCONTIG2BIN_CONCOCT                            -
[93/deee2b] process > NFCORE_MAG:MAG:BINNING_REFINEMENT:DASTOOL_DASTOOL (FF07295017)                                 [100%] 16 of 16, cached: 16 ✔
[4e/893128] process > NFCORE_MAG:MAG:BINNING_REFINEMENT:RENAME_POSTDASTOOL (MEGAHIT-FF07295017)                      [100%] 16 of 16, cached: 16 ✔
[b9/3c3f12] process > NFCORE_MAG:MAG:DEPTHS:MAG_DEPTHS (SPAdes-MetaBAT2-FF07295173)                                  [100%] 48 of 48, cached: 48 ✔
[-        ] process > NFCORE_MAG:MAG:DEPTHS:MAG_DEPTHS_PLOT                                                          -
[cd/c7d0ac] process > NFCORE_MAG:MAG:DEPTHS:MAG_DEPTHS_SUMMARY                                                       [100%] 1 of 1, cached: 1 ✔
[fa/af31ce] process > NFCORE_MAG:MAG:CHECKM_QC:CHECKM_LINEAGEWF (SPAdes-DASTool-prokarya-dastool_refined-FF07295042) [100%] 65 of 65, cached: 65 ✔
[ea/7f4c0f] process > NFCORE_MAG:MAG:CHECKM_QC:CHECKM_QA (FF07295042)                                                [100%] 65 of 65, cached: 65 ✔
[40/b714fa] process > NFCORE_MAG:MAG:CHECKM_QC:COMBINE_CHECKM_TSV                                                    [100%] 1 of 1, cached: 1 ✔
[d6/3b6c30] process > NFCORE_MAG:MAG:GUNC_QC:GUNC_RUN (FF07295017)                                                   [100%] 2539 of 2539, cached: 2539 ✔
[37/535f21] process > NFCORE_MAG:MAG:GUNC_QC:GUNC_MERGECHECKM (FF07295017)                                           [100%] 2539 of 2539, cached: 2539 ✔
[04/b7d1b4] process > NFCORE_MAG:MAG:QUAST_BINS (MEGAHIT-DASTool-prokarya-dastool_refined_unbinned-FF07295017)       [100%] 78 of 78, cached: 78 ✔
[89/6717d5] process > NFCORE_MAG:MAG:QUAST_BINS_SUMMARY                                                              [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:CAT                                                                             -
[-        ] process > NFCORE_MAG:MAG:CAT_SUMMARY                                                                     -
[28/383e45] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF (SPAdes-MaxBin2-prokarya-unrefined-FF07295009)         [100%] 1 of 1, failed: 1 ✘
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_SUMMARY                                                           [  0%] 0 of 1
[-        ] process > NFCORE_MAG:MAG:BIN_SUMMARY                                                                     -
[-        ] process > NFCORE_MAG:MAG:CUSTOM_DUMPSOFTWAREVERSIONS                                                     -
[-        ] process > NFCORE_MAG:MAG:MULTIQC                                                                         -
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/mag] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF (SPAdes-MaxBin2-prokarya-unrefined-FF07295009)'

Caused by:
  Process `NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF (SPAdes-MaxBin2-prokarya-unrefined-FF07295009)` terminated with an error exit status (1)

Command executed:

  export GTDBTK_DATA_PATH="${PWD}/database"
  if [ --scratch_dir pplacer_tmp != "" ] ; then
      mkdir pplacer_tmp
  fi

  gtdbtk classify_wf \
      --extension fa \
      --genome_dir bins \
      --prefix "gtdbtk.SPAdes-MaxBin2-prokarya-unrefined-FF07295009" \
      --out_dir "${PWD}" \
      --cpus 8 \
      --mash_db gtdbtk_2.3v214_mash \
      --scratch_dir pplacer_tmp \
      --min_perc_aa 10 \
      --min_af 0.65

  mv classify/* .

  mv identify/* .

  mv align/* .
  mv gtdbtk.log "gtdbtk.SPAdes-MaxBin2-prokarya-unrefined-FF07295009.log"

  mv gtdbtk.warnings.log "gtdbtk.SPAdes-MaxBin2-prokarya-unrefined-FF07295009.warnings.log"

  find -name "gtdbtk.SPAdes-MaxBin2-prokarya-unrefined-FF07295009.*.classify.tree" | xargs -r gzip # do not fail if .tree is missing

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF":
      gtdbtk: $(echo $(gtdbtk --version -v 2>&1) | sed "s/gtdbtk: version //; s/ Copyright.*//")
  END_VERSIONS

Command exit status:
  1

Command output:
  [2024-02-29 15:57:39] INFO: GTDB-Tk v2.3.2
  [2024-02-29 15:57:39] INFO: gtdbtk classify_wf --extension fa --genome_dir bins --prefix gtdbtk.SPAdes-MaxBin2-prokarya-unrefined-FF07295009 --out_dir /home/christian/work/28/383e4595b0622839718a3d01c47ca1 --cpus 8 --mash_db gtdbtk_2.3v214_mash --scratch_dir pplacer_tmp --min_perc_aa 10 --min_af 0.65
  [2024-02-29 15:57:39] INFO: Using GTDB-Tk reference data version r214: database
  [2024-02-29 15:57:40] INFO: Loading reference genomes.
  [2024-02-29 15:57:41] INFO: Using Mash version 2.3
  [2024-02-29 15:57:41] INFO: Creating Mash sketch file: classify/ani_screen/intermediate_results/mash/gtdbtk.SPAdes-MaxBin2-prokarya-unrefined-FF07295009.user_query_sketch.msh
  [2024-02-29 15:57:42] INFO: Completed 18 genomes in 0.65 seconds (27.78 genomes/second).
  [2024-02-29 15:57:42] INFO: Loading data from existing Mash sketch file: gtdbtk_2.3v214_mash/gtdb_ref_sketch.msh
  [2024-02-29 15:57:45] INFO: Calculating Mash distances.
  [2024-02-29 15:58:06] INFO: Calculating ANI with FastANI v1.32.
  [2024-02-29 15:58:12] INFO: Completed 44 comparisons in 5.86 seconds (7.51 comparisons/second).
  [2024-02-29 15:58:13] INFO: Summary of results saved to: classify/ani_screen/gtdbtk.SPAdes-MaxBin2-prokarya-unrefined-FF07295009.bac120.ani_summary.tsv
  [2024-02-29 15:58:13] INFO: 18 genome(s) have been classified using the ANI pre-screening step.
  [2024-02-29 15:58:13] INFO: Done.
  [2024-02-29 15:58:13] INFO: All genomes have been classified by the ANI screening step, Identify and Align steps will be skipped.
  [2024-02-29 15:58:14] INFO: Note that Tk classification mode is insufficient for publication of new taxonomic designations. New designations should be based on one or more de novo trees, an example of which can be produced by Tk in de novo mode.
  [2024-02-29 15:58:14] INFO: Done.
  [2024-02-29 15:58:14] INFO: Removing intermediate files.
  [2024-02-29 15:58:14] INFO: Intermediate files removed.
  [2024-02-29 15:58:14] INFO: Done.

Command error:
  ==> Processed 0/18 genomes (0%) |               | [?genome/s, ETA ?]
  ==> Processed 11/18 genomes (61%) |█████████▏     | [60.45genome/s, ETA 00:00]
  ==> Processed 18/18 genomes (100%) |███████████████| [35.92genome/s, ETA 00:00]

  ==> Processed 18/18 genomes (100%) |███████████████| [35.92genome/s, ETA 00:00]
                                                                                 [2024-02-29 15:57:42] INFO: Completed 18 genomes in 0.65 seconds (27.78 genomes/second).
  [2024-02-29 15:57:42] INFO: Loading data from existing Mash sketch file: gtdbtk_2.3v214_mash/gtdb_ref_sketch.msh
  [2024-02-29 15:57:45] INFO: Calculating Mash distances.
  [2024-02-29 15:58:06] INFO: Calculating ANI with FastANI v1.32.

  ==> Processed 0/44 comparisons (0%) |               | [?comparison/s, ETA ?]
  ==> Processed 1/44 comparisons (2%) |▎              | [ 1.00comparison/s, ETA 00:42]
  ==> Processed 3/44 comparisons (7%) |█              | [ 2.39comparison/s, ETA 00:17]
  ==> Processed 4/44 comparisons (9%) |█▎             | [ 2.96comparison/s, ETA 00:13]
  ==> Processed 5/44 comparisons (11%) |█▋             | [ 2.87comparison/s, ETA 00:13]
  ==> Processed 7/44 comparisons (16%) |██▍            | [ 3.91comparison/s, ETA 00:09]
  ==> Processed 11/44 comparisons (25%) |███▊           | [ 6.37comparison/s, ETA 00:05]
  ==> Processed 13/44 comparisons (30%) |████▍          | [ 6.53comparison/s, ETA 00:04]
  ==> Processed 15/44 comparisons (34%) |█████          | [ 6.37comparison/s, ETA 00:04]
  ==> Processed 18/44 comparisons (41%) |██████▏        | [ 7.11comparison/s, ETA 00:03]
  ==> Processed 20/44 comparisons (45%) |██████▊        | [ 6.96comparison/s, ETA 00:03]
  ==> Processed 22/44 comparisons (50%) |███████▌       | [ 7.19comparison/s, ETA 00:03]
  ==> Processed 24/44 comparisons (55%) |████████▏      | [ 7.47comparison/s, ETA 00:02]
  ==> Processed 26/44 comparisons (59%) |████████▊      | [ 7.31comparison/s, ETA 00:02]
  ==> Processed 28/44 comparisons (64%) |█████████▌     | [ 7.88comparison/s, ETA 00:02]
  ==> Processed 30/44 comparisons (68%) |██████████▏    | [ 7.60comparison/s, ETA 00:01]
  ==> Processed 31/44 comparisons (70%) |██████████▌    | [ 7.57comparison/s, ETA 00:01]
  ==> Processed 33/44 comparisons (75%) |███████████▎   | [ 7.59comparison/s, ETA 00:01]
  ==> Processed 35/44 comparisons (80%) |███████████▉   | [ 7.99comparison/s, ETA 00:01]
  ==> Processed 38/44 comparisons (86%) |████████████▉  | [ 8.98comparison/s, ETA 00:00]
  ==> Processed 40/44 comparisons (91%) |█████████████▋ | [ 9.21comparison/s, ETA 00:00]
  ==> Processed 42/44 comparisons (95%) |██████████████▎| [ 8.47comparison/s, ETA 00:00]
  ==> Processed 44/44 comparisons (100%) |███████████████| [ 8.90comparison/s, ETA 00:00]

  ==> Processed 44/44 comparisons (100%) |███████████████| [ 8.90comparison/s, ETA 00:00]

  [2024-02-29 15:58:12] INFO: Completed 44 comparisons in 5.86 seconds (7.51 comparisons/second).
  [2024-02-29 15:58:13] INFO: Summary of results saved to: classify/ani_screen/gtdbtk.SPAdes-MaxBin2-prokarya-unrefined-FF07295009.bac120.ani_summary.tsv
  [2024-02-29 15:58:13] INFO: 18 genome(s) have been classified using the ANI pre-screening step.
  [2024-02-29 15:58:13] INFO: Done.
  [2024-02-29 15:58:13] INFO: All genomes have been classified by the ANI screening step, Identify and Align steps will be skipped.
  [2024-02-29 15:58:14] INFO: Note that Tk classification mode is insufficient for publication of new taxonomic designations. New designations should be based on one or more de novo trees, an example of which can be produced by Tk in de novo mode.
  [2024-02-29 15:58:14] INFO: Done.
  [2024-02-29 15:58:14] INFO: Removing intermediate files.
  [2024-02-29 15:58:14] INFO: Intermediate files removed.
  [2024-02-29 15:58:14] INFO: Done.
  mv: can't rename 'identify/*': No such file or directory

Work dir:
  /home/christian/work/28/383e4595b0622839718a3d01c47ca1

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details

Relevant files

nextflow.log

System information

Nextflow version: 23.10.1.5891
Hardware: Desktop
Executor: local
Container engine: Docker
OS: Ubuntu 22.04.4 LTS
Version of nf-core/mag: 2.5.4

jfy133 commented 4 months ago

Ah good intuition, the mash support was added recently to the module/pipeline and so hasn't been tested much.

Both @maxibor (who iirc updated the module) and myself are traveling but will look next week when back :)

jfy133 commented 3 months ago

OK, I've semi-replicated this, but with "mv: can't rename 'align/*': No such file or directory" in my case! So regardless, that module appears to be flaky in a variety of ways.

maxibor commented 3 months ago

Sounds like a find or an if [[ -f something ]]; then would do the trick
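
A minimal sketch of that idea, under some assumptions: `move_if_present` is a hypothetical helper and not the code that was merged into the module, the demo runs in a scratch directory with placeholder files, and it tests with `-d` rather than `-f` because `classify`, `identify` and `align` are directories.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not taken from the nf-core module): move the
# contents of an output subdirectory into the current directory, but
# only when that subdirectory was actually created.
move_if_present() {
    local dir="$1"
    if [[ -d "$dir" ]]; then
        # find also copes with a directory that exists but is empty,
        # where a bare glob would again be passed to mv unexpanded.
        find "$dir" -mindepth 1 -maxdepth 1 -exec mv {} . \;
    else
        echo "Skipping '$dir': not created by this run" >&2
    fi
}

# Demo: only "classify" exists, mimicking a run where the Identify and
# Align steps were skipped.
workdir="$(mktemp -d)"
cd "$workdir"
mkdir -p classify
touch classify/summary.tsv

for d in classify identify align; do
    move_if_present "$d"
done
```

With this guard the missing `identify` and `align` directories only produce a message on stderr, and the script continues to the remaining steps.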

jfy133 commented 1 month ago

Done in the previous PRs listed above :)