epi2me-labs / wf-metagenomics

Metagenomic classification of long-read sequencing data

no such variable : ref2taxid_file #91

Closed Aline-Git closed 5 months ago

Aline-Git commented 7 months ago

Operating System

Other Linux (please specify below)

Other Linux

ubuntu 18.04

Workflow Version

v2.9.3-g6636bc9

Workflow Execution

Command line (Cluster)

Other workflow execution

On a virtual machine via command line

EPI2ME Version

No response

CLI command run

nextflow run epi2me-labs/wf-metagenomics \
    --fastq /data/VITAE/WP3_20240205/input_folder \
    --sample_sheet /data/VITAE/WP3_20240205/sample_sheet.csv \
    --classifier minimap2 \
    --reference /data/reference/database/PR2_db/pr2_version_5_0_0_SSU_dada2.mmi \
    --ref2taxid /data/reference/database/PR2_db/ref2taxid_PR2_rapide.tsv

Workflow Execution - CLI Execution Profile

standard (default)

What happened?

Hello, thanks for this workflow. I tested it with the default parameters and it worked fine.

I would like to use the workflow with the PR2 database (18S). At first I wanted to use the kraken2 option, but building the custom kraken2 database seems a bit complicated, since a nodes.dmp file has to be provided.

The database is a FASTA file with this format:

taxonomy1
CGCTTAAACTA...
taxonomy2
CSCTAATTCTA...
etc.

where taxonomyn looks like this: Eukaryota;Obazoa;Opisthokonta;Fungi;Ascomycota;Pezizomycotina;Eurotiomycetes;Knufia;Knufia_epidermidis;

I tried the minimap2 option. At first I did not provide a ref2taxid file, but that raised an error. I then created an artificial ref2taxid file with this format:

taxonomy1\ttaxonomy1
taxonomy2\ttaxonomy2
etc.

But now I get this error:

Relevant log output

Checking inputs.
Note: Reference/Database are custom.
Note: Memory available to the workflow must be slightly higher than size of the database custom index.
Note: Memory available to the workflow must be slightly higher than size of the database Standard-8 index (8GB) or consider to use --kraken2_memory_mapping
Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
[-        ] process > validate_sample_sheet                      -
[-        ] process > fastcat                                    -
[-        ] process > prepare_databases:download_unpack_taxonomy -
Note: Empty files or those files whose reads have been discarded after filtering based on read length and/or read quality will not appear in the report and will be excluded from subsequent analysis.
Minimap2 pipeline.
Preparing databases.
Using default taxonomy database.
Checking custom reference exists
ERROR ~ No such variable: ref2taxid_file

 -- Check script '/home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/./modules/local/databases.nf' at line: 372 or see '.nextflow.log' file for more details

Application activity log entry

[main] DEBUG nextflow.cli.Launcher - $> nextflow run epi2me-labs/wf-metagenomics --fastq /data/VITAE/WP3_20240205/input_folde
r --sample_sheet /data/VITAE/WP3_20240205/sample_sheet.csv --classifier minimap2 --reference /data/reference/database/PR2_db/pr2_version_5_0_0_SS
U_dada2.mmi --ref2taxid /data/reference/database/PR2_db/ref2taxid_PR2_rapide.tsv
Apr-05 18:45:03.935 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 23.10.1
Apr-05 18:45:03.965 [main] DEBUG nextflow.plugin.PluginsFacade - Setting up plugin manager > mode=prod; embedded=false; plugins-dir=/home/cluster
/.nextflow/plugins; core-plugins: nf-amazon@2.1.4,nf-azure@1.3.3,nf-cloudcache@0.3.0,nf-codecommit@0.1.5,nf-console@1.0.6,nf-ga4gh@1.1.0,nf-googl
e@1.8.3,nf-tower@1.6.3,nf-wave@1.0.1
Apr-05 18:45:03.980 [main] INFO  o.pf4j.DefaultPluginStatusProvider - Enabled plugins: []
Apr-05 18:45:03.982 [main] INFO  o.pf4j.DefaultPluginStatusProvider - Disabled plugins: []
Apr-05 18:45:03.986 [main] INFO  org.pf4j.DefaultPluginManager - PF4J version 3.4.1 in 'deployment' mode
Apr-05 18:45:04.000 [main] INFO  org.pf4j.AbstractPluginManager - No plugins
Apr-05 18:45:04.021 [main] DEBUG nextflow.scm.ProviderConfig - Using SCM config path: /home/cluster/.nextflow/scm
Apr-05 18:45:05.357 [main] DEBUG nextflow.scm.AssetManager - Git config: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/.git/config;
branch: master; remote: origin; url: https://github.com/epi2me-labs/wf-metagenomics.git
Apr-05 18:45:05.389 [main] DEBUG nextflow.scm.RepositoryFactory - Found Git repository result: [RepositoryFactory]
Apr-05 18:45:05.404 [main] DEBUG nextflow.scm.AssetManager - Git config: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/.git/config;
branch: master; remote: origin; url: https://github.com/epi2me-labs/wf-metagenomics.git
Apr-05 18:45:25.880 [main] DEBUG nextflow.scm.AssetManager - WARN: Failed to check remote Git revision
org.eclipse.jgit.api.errors.TransportException: https://github.com/epi2me-labs/wf-metagenomics.git: cannot open git-upload-pack
        at org.eclipse.jgit.api.LsRemoteCommand.execute(LsRemoteCommand.java:192)
        at org.eclipse.jgit.api.LsRemoteCommand.call(LsRemoteCommand.java:131)
        at nextflow.scm.AssetManager.getRemoteCommitId(AssetManager.groovy:1003)
        at nextflow.scm.AssetManager.checkRemoteStatus0(AssetManager.groovy:1014)
        at nextflow.scm.AssetManager.checkRemoteStatus(AssetManager.groovy:1032)
        at nextflow.cli.CmdRun.getScriptFile0(CmdRun.groovy:540)
        at nextflow.cli.CmdRun.getScriptFile(CmdRun.groovy:462)
        at nextflow.cli.CmdRun.run(CmdRun.groovy:317)
        at nextflow.cli.Launcher.run(Launcher.groovy:500)
        at nextflow.cli.Launcher.main(Launcher.groovy:672)
Caused by: org.eclipse.jgit.errors.TransportException: https://github.com/epi2me-labs/wf-metagenomics.git: cannot open git-upload-pack
        at org.eclipse.jgit.transport.TransportHttp.connect(TransportHttp.java:749)
        at org.eclipse.jgit.transport.TransportHttp.openFetch(TransportHttp.java:465)
        at org.eclipse.jgit.api.LsRemoteCommand.execute(LsRemoteCommand.java:170)
        ... 9 common frames omitted
Caused by: java.net.UnknownHostException: github.com
        at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:229)
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.base/java.net.Socket.connect(Socket.java:609)
        at java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:305)
        at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:177)
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:507)
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:602)
        at java.base/sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:266)
        at java.base/sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:373)
        at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:207)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1232)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1081)
        at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:193)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1592)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1520)
        at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527)
        at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:334)
        at org.eclipse.jgit.transport.http.JDKHttpConnection.getResponseCode(JDKHttpConnection.java:85)
        at org.eclipse.jgit.util.HttpSupport.response(HttpSupport.java:232)
        at org.eclipse.jgit.transport.TransportHttp.connect(TransportHttp.java:654)
        ... 11 common frames omitted
Apr-05 18:45:25.907 [main] DEBUG nextflow.config.ConfigBuilder - Found config base: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/ne
xtflow.config
Apr-05 18:45:25.908 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/
nextflow.config
Apr-05 18:45:25.926 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `standard`
Apr-05 18:45:26.277 [main] DEBUG nextflow.cli.CmdRun - Applied DSL=2 from script declararion
Apr-05 18:45:26.278 [main] INFO  nextflow.cli.CmdRun - Launching `https://github.com/epi2me-labs/wf-metagenomics` [sleepy_minsky] DSL2 - revision
: 6636bc9044 [master]
Apr-05 18:45:26.278 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins default=[]
Apr-05 18:45:26.279 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins resolved requirement=[]
Apr-05 18:45:26.284 [main] DEBUG n.secret.LocalSecretsProvider - Secrets store: /home/cluster/.nextflow/secrets/store.json
Apr-05 18:45:26.289 [main] DEBUG nextflow.secret.SecretsLoader - Discovered secrets providers: [nextflow.secret.LocalSecretsProvider@604d23fa] -
activable => nextflow.secret.LocalSecretsProvider@604d23fa
Apr-05 18:45:26.411 [main] DEBUG nextflow.Session - Session UUID: 79c630fa-ed92-40c1-bd25-e9e90408e844
Apr-05 18:45:26.411 [main] DEBUG nextflow.Session - Run name: sleepy_minsky
Apr-05 18:45:26.412 [main] DEBUG nextflow.Session - Executor pool size: 12
Apr-05 18:45:26.425 [main] DEBUG nextflow.file.FilePorter - File porter settings maxRetries=3; maxTransfers=50; pollTimeout=null
Apr-05 18:45:26.433 [main] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'FileTransfer' minSize=10; maxSize=36; workQueue=LinkedBl
ockingQueue[10000]; allowCoreThreadTimeout=false
Apr-05 18:45:36.484 [main] DEBUG nextflow.cli.CmdRun -
  Version: 23.10.1 build 5891
  Created: 12-01-2024 22:01 UTC (23:01 CEST)
  System: Linux 5.4.0-139-generic
  Runtime: Groovy 3.0.19 on OpenJDK 64-Bit Server VM 11.0.18+10-post-Ubuntu-0ubuntu118.04.1
  Encoding: UTF-8 (UTF-8)
  Process: 24182@host-10-61-66-8 [10.61.66.8]
  CPUs: 12 - Mem: 31.4 GB (12.1 GB) - Swap: 0 (0)
Apr-05 18:45:36.516 [main] DEBUG nextflow.Session - Work-dir: /data/reference/database/work [ext2/ext3]
Apr-05 18:45:36.540 [main] DEBUG nextflow.executor.ExecutorFactory - Extension executors providers=[]
Apr-05 18:45:36.557 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory
Apr-05 18:45:36.651 [main] DEBUG nextflow.cache.CacheFactory - Using Nextflow cache factory: nextflow.cache.DefaultCacheFactory
Apr-05 18:45:36.668 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 13; maxThreads: 1000
Apr-05 18:45:36.771 [main] DEBUG nextflow.Session - Session start
Apr-05 18:45:36.778 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow started -- trace file: /data/reference/database/output/execution/tra
ce.txt
Apr-05 18:45:36.792 [main] DEBUG nextflow.Session - Using default localLib path: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/lib
Apr-05 18:45:36.797 [main] DEBUG nextflow.Session - Adding to the classpath library: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/l
ib
Apr-05 18:45:36.798 [main] DEBUG nextflow.Session - Adding to the classpath library: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/l
ib/nfcore_external_java_deps.jar
Apr-05 18:45:38.612 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Apr-05 18:45:44.092 [main] WARN  nextflow.NextflowMeta$Preview - NEXTFLOW RECURSION IS A PREVIEW FEATURE - SYNTAX AND FUNCTIONALITY CAN CHANGE IN
 FUTURE RELEASES
Apr-05 18:45:44.270 [main] INFO  nextflow.Nextflow -
||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-metagenomics v2.9.3-g6636bc9
--------------------------------------------------------------------------------
Core Nextflow options
  revision       : master
  runName        : sleepy_minsky
  containerEngine: docker
  container      : [withLabel:wfmetagenomics:ontresearch/wf-metagenomics:sha44a6dacff5f2001d917b774647bb4cbc1b53bc76, withLabel:wf_co
mmon:ontresearch/wf-common:sha645176f98b8780851f9c476a064d44c2ae76ddf6, withLabel:amr:ontresearch/abricate:sha2c763f19fac46035437854f1e2a5f055535
42a78]
  launchDir      : /data/reference/database
  workDir        : /data/reference/database/work
  projectDir     : /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics
  userName       : cluster
  profile        : standard
  configFiles    : /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/nextflow.config

Input Options
  fastq          : /data/VITAE/WP3_20240205/input_folder
  classifier     : minimap2

Sample Options
  sample_sheet   : /data/VITAE/WP3_20240205/sample_sheet.csv

Reference Options
  reference      : /data/reference/database/PR2_db/pr2_version_5_0_0_SSU_dada2.mmi
  ref2taxid      : /data/reference/database/PR2_db/ref2taxid_PR2_rapide.tsv
  database_sets  : [ncbi_16s_18s:[reference:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/n
cbi_targeted_loci_16s_18s.fna, database:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ncbi_targeted_loc
i_kraken2.tar.gz, ref2taxid:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ref2taxid.targloci.tsv, taxon
omy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/taxdmp_2023-01-01.zip], ncbi_16s_18s_28s_ITS:[reference:https://ont-exd-int-s3-euws
t1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS.fna, database:https://ont-exd-int-s3-euwst1-epi2me-labs
.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS_kraken2.tar.gz, ref2taxid:https://ont-exd-int-s3-euwst1-epi2me-labs.s
3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ref2taxid.ncbi_16s_18s_28s_ITS.tsv, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdu
mp_archive/taxdmp_2023-01-01.zip], SILVA_138_1:[database:null], Standard-8:[database:https://genome-idx.s3.amazonaws.com/kraken/k2_standard_08gb_
20231009.tar.gz, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2023-03-01.zip], PlusPF-8:[database:https://genom
e-idx.s3.amazonaws.com/kraken/k2_pluspf_08gb_20230314.tar.gz, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2023
-03-01.zip], PlusPFP-8:[database:https://genome-idx.s3.amazonaws.com/kraken/k2_pluspfp_08gb_20230314.tar.gz, taxonomy:https://ftp.ncbi.nlm.nih.go
v/pub/taxonomy/taxdump_archive/new_taxdump_2023-03-01.zip]]

!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-metagenomics for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

--------------------------------------------------------------------------------
This is epi2me-labs/wf-metagenomics v2.9.3-g6636bc9.
--------------------------------------------------------------------------------
Apr-05 18:46:14.352 [main] INFO  nextflow.Nextflow - Checking inputs.
Apr-05 18:46:14.354 [main] INFO  nextflow.Nextflow - Note: Reference/Database are custom.
Apr-05 18:46:14.354 [main] INFO  nextflow.Nextflow - Note: Memory available to the workflow must be slightly higher than size of the database cus
tom index.
Apr-05 18:46:14.355 [main] INFO  nextflow.Nextflow - Note: Memory available to the workflow must be slightly higher than size of the database Sta
ndard-8 index (8GB) or consider to use --kraken2_memory_mapping
Apr-05 18:46:14.367 [main] INFO  nextflow.Nextflow - Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
Apr-05 18:46:14.528 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:wf_common` matches labels `ingress,wf_common` for pro
cess with name validate_sample_sheet
Apr-05 18:46:14.548 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-05 18:46:14.548 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-05 18:46:14.555 [main] DEBUG nextflow.executor.Executor - [warm up] executor > local
Apr-05 18:46:14.562 [main] DEBUG n.processor.LocalPollingMonitor - Creating local task monitor for executor 'local' > cpus=12; memory=31.4 GB; ca
pacity=12; pollInterval=100ms; dumpInterval=5m
Apr-05 18:46:14.565 [main] DEBUG n.processor.TaskPollingMonitor - >>> barrier register (monitor: local)
Apr-05 18:46:14.766 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:wf_common` matches labels `ingress,wf_common` for pro
cess with name fastcat
Apr-05 18:46:14.767 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-05 18:46:14.767 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-05 18:46:14.780 [main] INFO  nextflow.Nextflow - Note: Empty files or those files whose reads have been discarded after filtering based on re
ad length and/or read quality will not appear in the report and will be excluded from subsequent analysis.
Apr-05 18:46:14.784 [main] INFO  nextflow.Nextflow - Minimap2 pipeline.
Apr-05 18:46:14.785 [main] INFO  nextflow.Nextflow - Preparing databases.
Apr-05 18:46:14.785 [main] INFO  nextflow.Nextflow - Using default taxonomy database.
Apr-05 18:46:14.794 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:wfmetagenomics` matches labels `wfmetagenomics` for p
rocess with name prepare_databases:download_unpack_taxonomy
Apr-05 18:46:14.795 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: null
Apr-05 18:46:14.795 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'local'
Apr-05 18:46:14.798 [main] INFO  nextflow.Nextflow - Checking custom reference exists
Apr-05 18:46:14.801 [main] DEBUG nextflow.script.ScriptRunner - Parsed script files:
  Script_f6bc47f22a291b45: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/./subworkflows/../modules/local/amr.nf
  Script_f6523ca3a05960be: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/./lib/ingress.nf
  Script_acc474d2620316c6: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/./subworkflows/minimap_pipeline.nf
  Script_3b8efe5fa3271859: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/./subworkflows/real_time_pipeline.nf
  Script_395febca8bfa0dbb: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/./modules/local/databases.nf
  Script_a0145f9525620268: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/./subworkflows/kraken_pipeline.nf
  Script_c9e0cbf8c1ac304f: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/./subworkflows/../modules/local/common.nf
  Script_fdaaff176d03b24a: /home/cluster/.nextflow/assets/epi2me-labs/wf-metagenomics/main.nf
Apr-05 18:46:14.802 [main] DEBUG nextflow.Session - Session aborted -- Cause: No such property: ref2taxid_file for class: nextflow.script.Workflo
wBinding
Apr-05 18:46:14.823 [main] DEBUG nextflow.Session - The following nodes are still active:
  [operator] map
  [operator] concat
  [operator] last
  [operator] splitCsv
  [operator] map
  [operator] map
  [operator] join
  [operator] map
  [operator] map
  [operator] branch
  [operator] mix
  [operator] map
  [operator] mix
  [operator] map
  [operator] subscribe
  [operator] map
  [operator] filter

Apr-05 18:46:14.879 [Task monitor] DEBUG n.processor.TaskPollingMonitor - <<< barrier arrives (monitor: local) - terminating tasks monitor poll l
oop
Apr-05 18:46:44.874 [main] ERROR nextflow.cli.Launcher - @unknown
groovy.lang.MissingPropertyException: No such property: ref2taxid_file for class: nextflow.script.WorkflowBinding
        at groovy.lang.Binding.getVariable(Binding.java:61)
        at nextflow.script.WorkflowBinding.getVariable(WorkflowBinding.groovy:140)
        at groovy.lang.Binding.getProperty(Binding.java:116)
        at nextflow.script.WorkflowBinding.getProperty(WorkflowBinding.groovy:129)
        at org.codehaus.groovy.runtime.InvokerHelper.getProperty(InvokerHelper.java:190)
        at groovy.lang.Closure.getPropertyTryThese(Closure.java:320)
        at groovy.lang.Closure.getPropertyDelegateFirst(Closure.java:310)
        at groovy.lang.Closure.getProperty(Closure.java:296)
        at org.codehaus.groovy.runtime.callsite.PogoGetPropertySite.getProperty(PogoGetPropertySite.java:49)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callGroovyObjectGetProperty(AbstractCallSite.java:341)
        at Script_395febca8bfa0dbb$_runScript_closure7$_closure26.doCall(Script_395febca8bfa0dbb:372)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1030)
        at groovy.lang.Closure.call(Closure.java:427)
        at groovy.lang.Closure.call(Closure.java:406)
        at nextflow.script.WorkflowDef.run0(WorkflowDef.groovy:204)
        at nextflow.script.WorkflowDef.run(WorkflowDef.groovy:188)
        at nextflow.script.BindableDef.invoke_a(BindableDef.groovy:51)
        at nextflow.script.ComponentDef.invoke_o(ComponentDef.groovy:40)
        at nextflow.script.WorkflowBinding.invokeMethod(WorkflowBinding.groovy:102)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeOnDelegationObjects(ClosureMetaClass.java:408)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:350)
        at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.callCurrent(PogoMetaClassSite.java:61)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:171)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:194)
        at Script_fdaaff176d03b24a$_runScript_closure1$_closure4.doCall(Script_fdaaff176d03b24a:136)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1030)
        at groovy.lang.Closure.call(Closure.java:427)
        at groovy.lang.Closure.call(Closure.java:406)
        at nextflow.script.WorkflowDef.run0(WorkflowDef.groovy:204)
        at nextflow.script.WorkflowDef.run(WorkflowDef.groovy:188)
        at nextflow.script.BindableDef.invoke_a(BindableDef.groovy:51)
        at nextflow.script.IterableDef$invoke_a.call(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:139)
        at nextflow.script.BaseScript.run0(BaseScript.groovy:183)
        at nextflow.script.BaseScript.run(BaseScript.groovy:192)
        at nextflow.script.ScriptParser.runScript(ScriptParser.groovy:236)
        at nextflow.script.ScriptRunner.run(ScriptRunner.groovy:242)
        at nextflow.script.ScriptRunner.execute(ScriptRunner.groovy:137)
        at nextflow.cli.CmdRun.run(CmdRun.groovy:372)
        at nextflow.cli.Launcher.run(Launcher.groovy:500)
        at nextflow.cli.Launcher.main(Launcher.groovy:672)

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

nggvs commented 7 months ago

Hi @Aline-Git, thank you for using the workflow! To use the workflow with minimap2 and a custom database, you need the following files. The FASTA file:

>seq1
AAAAAA
>seq2
AAAAAA

The ref2taxid file:

seq1\t564117
seq2\t1454219

where 564117 is the NCBI taxid for Marinobacter antarcticus and 1454219 is the NCBI taxid for Pseudomonas aeruginosa 059A
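
The pairing described above (every FASTA sequence ID has one tab-separated line in the ref2taxid file, ending in a numeric taxid) can be checked before launching the workflow. The following is only a minimal sketch, not part of wf-metagenomics: the function names `fasta_ids` and `check_ref2taxid` are hypothetical, and it assumes that the sequence ID is the first whitespace-delimited token of each FASTA header and that taxids are integers, as in the NCBI taxonomy.

```python
import io


def fasta_ids(handle):
    """Yield the sequence ID (first whitespace-delimited token) of each FASTA header."""
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            yield fields[0] if fields else ""


def check_ref2taxid(fasta_handle, ref2taxid_handle):
    """Return a list of problems found; an empty list means the two files agree.

    Hypothetical helper: assumes two tab-separated columns (sequence ID, integer taxid).
    """
    problems = []
    mapping = {}
    for n, line in enumerate(ref2taxid_handle, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        parts = line.split("\t")
        if len(parts) != 2:
            problems.append(f"line {n}: expected 2 tab-separated fields, got {len(parts)}")
            continue
        seq_id, taxid = parts
        if not taxid.strip().isdigit():
            problems.append(f"line {n}: taxid {taxid!r} is not an integer")
            continue
        mapping[seq_id] = int(taxid.strip())
    # Every reference sequence needs a taxid, otherwise it cannot be classified.
    for seq_id in fasta_ids(fasta_handle):
        if seq_id not in mapping:
            problems.append(f"FASTA ID {seq_id!r} has no ref2taxid entry")
    return problems


# Usage with the two-sequence example from this comment (in-memory files).
demo_fasta = io.StringIO(">seq1\nAAAAAA\n>seq2\nAAAAAA\n")
demo_map = io.StringIO("seq1\t564117\nseq2\t1454219\n")
for problem in check_ref2taxid(demo_fasta, demo_map):
    print(problem)
```

A file like the artificial one in the original report, where the second column repeats the taxonomy string, would be flagged by the integer check in this sketch.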

Alternatively, you can use different taxids, but in that case you also need to provide a custom taxonomy database. You can take a look here to learn more about how to use custom databases. Please let me know if this helps with your problem!

nggvs commented 6 months ago

Hi @Aline-Git, were you able to run the workflow with it? If so, please close the issue.

nggvs commented 5 months ago

Hi @Aline-Git, I hope you were able to run the workflow. I'm going to close the issue as there is no news, but please feel free to open a new one if you find something else. Thank you for using the workflow!

Aline-Git commented 5 months ago

Thanks for your help. Indeed it works: the pipeline completes once this ref2taxid file is provided.

Yet what I would really like is to keep the taxonomy of the PR2 database, which differs from the NCBI taxonomy. I don't know whether that is possible.

I will try a bit by myself and open a new issue if I cannot do it.

Thank you for the workflow :) !

nggvs commented 5 months ago

Thank you! If the taxids are different from the NCBI ones, then you need a custom taxonomy database.

chris-krohn commented 1 week ago

Hi there, if I understood this correctly, most of the confusion about mapping reads with minimap2 to a custom (non-NCBI) database and taxonomy stems from the need for a taxonomy database. Usually, when mapping with minimap2, only two files are needed (in addition to the unknown reads): the .fasta file with the reference sequences and the .tsv file that contains the taxonomy strings with matching sequence IDs. But in the case of the wf-metagenomics workflow we seem to need a third entity, the 'taxonomy database', which seems to be a bunch of .dmp files in a folder. I feel this is not well explained here. If I have my own database (.fasta + .tsv files), how do I create the required taxdump folder, and why is it even needed? Thanks!
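
For illustration only, one way such a folder of .dmp files could be derived from semicolon-separated lineage strings like the PR2 example earlier in this thread is sketched below. This is not the workflow's own method. The function name `build_taxdump` and the `ranks` parameter are hypothetical, the taxids are made up sequentially, and only the leading columns of the NCBI-style `nodes.dmp` (taxid, parent taxid, rank) and `names.dmp` (taxid, name, unique name, name class) are written. Real NCBI `nodes.dmp` files carry further columns, and whether the workflow accepts this reduced form has not been confirmed here.

```python
import os
import tempfile


def build_taxdump(lineages, out_dir, ranks=None):
    """Write minimal names.dmp / nodes.dmp files and return {lineage: leaf taxid}.

    lineages: iterable of semicolon-separated strings, most general level first.
    ranks: optional rank name per depth (an assumption; defaults to "no rank").
    """
    taxids = {(): 1}             # lineage path (tuple of names) -> taxid; () is the root
    nodes = [(1, 1, "no rank")]  # (taxid, parent taxid, rank); the root is its own parent
    names = [(1, "root")]
    leaf_of = {}
    for lineage in lineages:
        levels = [x for x in lineage.strip().split(";") if x]
        path = ()
        for depth, name in enumerate(levels):
            parent = taxids[path]
            path = path + (name,)
            if path not in taxids:
                # Key on the full path so equal names in different branches stay distinct.
                taxids[path] = len(taxids) + 1
                rank = ranks[depth] if ranks and depth < len(ranks) else "no rank"
                nodes.append((taxids[path], parent, rank))
                names.append((taxids[path], name))
        leaf_of[lineage] = taxids[path]
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, "nodes.dmp"), "w") as fh:
        for taxid, parent, rank in nodes:
            fh.write(f"{taxid}\t|\t{parent}\t|\t{rank}\t|\n")
    with open(os.path.join(out_dir, "names.dmp"), "w") as fh:
        for taxid, name in names:
            fh.write(f"{taxid}\t|\t{name}\t|\t\t|\tscientific name\t|\n")
    return leaf_of


# Usage: the lineage quoted in this thread plus a hypothetical sibling species.
example_lineages = [
    "Eukaryota;Obazoa;Opisthokonta;Fungi;Ascomycota;Pezizomycotina;Eurotiomycetes;Knufia;Knufia_epidermidis;",
    "Eukaryota;Obazoa;Opisthokonta;Fungi;Ascomycota;Pezizomycotina;Eurotiomycetes;Knufia;Knufia_sp;",
]
example_dir = tempfile.mkdtemp()
leaf_taxids = build_taxdump(example_lineages, example_dir)
# Each ref2taxid line would then pair a sequence ID with its made-up leaf taxid.
for lineage, taxid in leaf_taxids.items():
    print(f"{lineage}\t{taxid}")
```

The returned mapping is what ties the pieces together: the ref2taxid file assigns each reference sequence a taxid, and the .dmp files tell the workflow where that taxid sits in the tree, which is presumably why a taxonomy database is needed whenever the taxids are not NCBI ones.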