epi2me-labs / wf-16s

discrepancy between counts in html report and in kraken2.assignments.tsv #27

Closed ramiroricardo closed 2 months ago

ramiroricardo commented 3 months ago

Operating System

Ubuntu 22.04

Other Linux

No response

Workflow Version

v1.2.0

Workflow Execution

EPI2ME Desktop (Local)

Other workflow execution

No response

EPI2ME Version

5.1.14

CLI command run

No response

Workflow Execution - CLI Execution Profile

None

What happened?

I ran wf-16s with Kraken2. When checking the file kraken2.assignments.tsv, I noticed that the number of reads assigned to a particular genus was much lower there (~700) than in the lineages/sunburst plots of the HTML report (>9000). I am not sure this is a bug; I am just trying to understand what is going on.

Relevant log output

N E X T F L O W  ~  version 23.04.2
Launching `/home/alvigen/epi2melabs/workflows/epi2me-labs/wf-16s/main.nf` [reverent_wescoff] DSL2 - revision: b7e1c64450
WARN: NEXTFLOW RECURSION IS A PREVIEW FEATURE - SYNTAX AND FUNCTIONALITY CAN CHANGE IN FUTURE RELEASES
||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-16s v1.2.0
--------------------------------------------------------------------------------
Core Nextflow options
  runName                    : reverent_wescoff
  containerEngine            : docker
  launchDir                  : /home/alvigen/epi2melabs/instances/wf-16s_01J4VZBN095H5TV5D2QWA2RRV4
  workDir                    : /home/alvigen/epi2melabs/instances/wf-16s_01J4VZBN095H5TV5D2QWA2RRV4/work
  projectDir                 : /home/alvigen/epi2melabs/workflows/epi2me-labs/wf-16s
  userName                   : alvigen
  profile                    : standard
  configFiles                : /home/alvigen/epi2melabs/workflows/epi2me-labs/wf-16s/nextflow.config, /home/alvigen/epi2melabs/instances/wf-16s_01J4VZBN095H5TV5D2QWA2RRV4/local.config
Input Options
  fastq                      : /var/lib/minknow/data/TestRun_MarcoCruz_service/no_sample_id/20240801_1826_MN42452_FAV04379_da054d2a/fastq_pass
  classifier                 : kraken2
Real Time Analysis Options
  server_threads             : 8
Reference Options
  database_set               : SILVA_138_1
  store_dir                  : /home/alvigen/epi2melabs/data
  database_sets              : [ncbi_16s_18s:[reference:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ncbi_targeted_loci_16s_18s.fna, database:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ncbi_targeted_loci_kraken2.tar.gz, ref2taxid:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ref2taxid.targloci.tsv, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/taxdmp_2023-01-01.zip], ncbi_16s_18s_28s_ITS:[reference:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS.fna, database:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS_kraken2.tar.gz, ref2taxid:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ref2taxid.ncbi_16s_18s_28s_ITS.tsv, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/taxdmp_2023-01-01.zip], SILVA_138_1:[database:null]]
Kraken2 Options
  include_kraken2_assignments: true
Output Options
  out_dir                    : /home/alvigen/epi2melabs/instances/wf-16s_01J4VZBN095H5TV5D2QWA2RRV4/output
Advanced Options
  min_len                    : 1200
  max_len                    : 1800
!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-16s for your analysis please cite:
* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x
--------------------------------------------------------------------------------
This is epi2me-labs/wf-16s v1.2.0.
--------------------------------------------------------------------------------
Checking inputs.
Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
Note: Empty files or those files whose reads have been discarded after filtering based on read length and/or read quality will not appear in the report and will be excluded from subsequent analysis.
Kraken2 pipeline.
Preparing databases.
Note: SILVA TaxIDs do not match NCBI TaxIDs
Note: The database will be created from original files, which may make the wf run slower.
[skipping] Stored process > prepare_databases:prepareSILVA
[75/45ab48] Submitted process > fastcat (2)
[c4/99ab3b] Submitted process > fastcat (9)
[88/a04d62] Submitted process > fastcat (4)
[ab/f0b10f] Submitted process > kraken_pipeline:run_common:getParams
[a2/f681c6] Submitted process > kraken_pipeline:run_common:getVersions
[df/3f3b1c] Submitted process > prepare_databases:determine_bracken_length
WARN: Found empty file for sample 'barcode12'.
[3e/f20984] Submitted process > fastcat (7)
[35/311970] Submitted process > fastcat (8)
[07/1f8878] Submitted process > fastcat (10)
[aa/fffaca] Submitted process > kraken_pipeline:output_results (1)
[83/ad943f] Submitted process > fastcat (6)
[b6/9a3c57] Submitted process > fastcat (1)
[e8/c721d4] Submitted process > fastcat (5)
[19/4ce78f] Submitted process > fastcat (12)
[c9/89ed98] Submitted process > fastcat (11)
[b7/a24c0f] Submitted process > fastcat (3)
[33/b9246a] Submitted process > kraken_pipeline:run_kraken2 (barcode01)
[79/d5b634] Submitted process > kraken_pipeline:run_kraken2 (barcode03)
[79/714bbe] Submitted process > kraken_pipeline:run_kraken2 (barcode06)
[08/a63729] Submitted process > kraken_pipeline:output_results (2)
[cc/7a6102] Submitted process > kraken_pipeline:run_bracken (barcode01)
[15/cd0e2a] Submitted process > kraken_pipeline:run_kraken2 (barcode07)
[d2/45c8cf] Submitted process > kraken_pipeline:run_bracken (barcode03)
[cf/b9f4e0] Submitted process > kraken_pipeline:run_kraken2 (barcode08)
[14/9dd101] Submitted process > kraken_pipeline:run_kraken2 (barcode10)
[5f/9c1b5d] Submitted process > kraken_pipeline:run_bracken (barcode06)
[a5/eb349d] Submitted process > kraken_pipeline:run_kraken2 (barcode11)
[e8/87ed3d] Submitted process > kraken_pipeline:run_bracken (barcode07)
[8c/04852b] Submitted process > kraken_pipeline:run_kraken2 (barcode05)
[95/0e3d29] Submitted process > kraken_pipeline:run_kraken2 (barcode09)
[33/f681ae] Submitted process > kraken_pipeline:run_kraken2 (barcode04)
[85/fb0b33] Submitted process > kraken_pipeline:run_bracken (barcode08)
[5d/7f813b] Submitted process > kraken_pipeline:run_kraken2 (barcode02)
[7f/75d259] Submitted process > kraken_pipeline:run_bracken (barcode10)
[69/9196d6] Submitted process > kraken_pipeline:run_bracken (barcode11)
[c9/1cceeb] Submitted process > kraken_pipeline:run_bracken (barcode05)
[e2/f23162] Submitted process > kraken_pipeline:run_bracken (barcode09)
[9a/d970ac] Submitted process > kraken_pipeline:run_bracken (barcode04)
[a1/86c7eb] Submitted process > kraken_pipeline:run_bracken (barcode02)
[8e/0d1982] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode02)
[18/d4a3b8] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode05)
[96/30f173] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode10)
[72/b58a7c] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode06)
[65/756383] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode03)
[c0/e56aa4] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode11)
[1c/07860c] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode01)
[60/67c170] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode08)
[55/a66336] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode09)
[7b/c06e3f] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode07)
[30/c58858] Submitted process > kraken_pipeline:output_kraken2_read_assignments (barcode04)
[04/ba64df] Submitted process > kraken_pipeline:output_results (3)
[1f/58a022] Submitted process > kraken_pipeline:output_results (4)
[e6/0822d2] Submitted process > kraken_pipeline:output_results (5)
[01/db6a0f] Submitted process > kraken_pipeline:output_results (6)
[16/cd9ae9] Submitted process > kraken_pipeline:createAbundanceTables
[9c/b027c2] Submitted process > kraken_pipeline:output_results (7)
[88/571993] Submitted process > kraken_pipeline:output_results (8)
[ae/77b48e] Submitted process > kraken_pipeline:output_results (9)
[8b/b98454] Submitted process > kraken_pipeline:output_results (10)
[dd/12a572] Submitted process > kraken_pipeline:output_results (11)
[bd/abad9b] Submitted process > kraken_pipeline:output_results (12)
[43/95f9a5] Submitted process > kraken_pipeline:output_results (13)
[28/f476d1] Submitted process > kraken_pipeline:output_results (14)
[43/5c667d] Submitted process > kraken_pipeline:makeReport (1)
[8f/6ce8b0] Submitted process > kraken_pipeline:output_results (15)

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

other (please describe below)

Other demo data information

Did not try

nggvs commented 3 months ago

Hi @ramiroricardo,

Thank you for using the workflow! The abundances from Kraken2 are subsequently refined by Bracken (https://github.com/jenniferlu717/Bracken), so the counts shown in the report can differ from the raw Kraken2 read assignments. In any case, could you share the abundance table and the Kraken2/Bracken reports from the output directory?

Thank you very much in advance!
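
To illustrate how such a gap can arise in general, here is a minimal Python sketch. It is not the workflow's code and not Bracken's actual algorithm: the taxonomy, the read counts, and the function names are all hypothetical, and the redistribution step is a simplified stand-in (proportional to the children's direct counts) for the kind of re-estimation Bracken performs. It shows that the number of reads labelled directly with a genus can be much smaller than the number attributed to that genus once reads held at a higher rank are pushed down.

```python
from collections import Counter

# Hypothetical toy taxonomy: child -> parent (names are illustrative only).
PARENT = {
    "GenusA": "FamilyX",
    "GenusB": "FamilyX",
    "FamilyX": "root",
}

# Hypothetical per-read labels: each read carries the lowest taxon
# at which it could be placed.
read_assignments = ["GenusA"] * 7 + ["GenusB"] * 3 + ["FamilyX"] * 90


def direct_counts(assignments):
    """Count reads labelled with exactly this taxon."""
    return Counter(assignments)


def clade_counts(assignments, parent):
    """Count reads labelled with this taxon or any of its descendants."""
    totals = Counter()
    for taxon in assignments:
        node = taxon
        while node is not None:
            totals[node] += 1
            node = parent.get(node)  # None once we step past the root
    return totals


def redistribute_to_children(direct, parent, node):
    """Split the reads held at `node` among its children, in proportion
    to the children's direct counts (simplified assumption, not Bracken)."""
    children = [c for c, p in parent.items() if p == node]
    child_total = sum(direct[c] for c in children)
    estimates = {}
    for c in children:
        extra = direct[node] * direct[c] / child_total if child_total else 0.0
        estimates[c] = direct[c] + extra
    return estimates


direct = direct_counts(read_assignments)
clade = clade_counts(read_assignments, PARENT)
estimated = redistribute_to_children(direct, PARENT, "FamilyX")

print("direct GenusA:", direct["GenusA"])
print("clade FamilyX:", clade["FamilyX"])
print("re-estimated GenusA:", estimated["GenusA"])
```

In this toy case only 7 reads are labelled directly as GenusA, while the re-estimated genus-level figure is ten times larger because most reads were held at the family level. Whether this explains the numbers in this issue can only be confirmed from the actual Kraken2 and Bracken reports.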