nf-core / ampliseq

Amplicon sequencing analysis workflow using DADA2 and QIIME2
https://nf-co.re/ampliseq
MIT License
166 stars, 108 forks

Add demultiplexing step #64

Open DiegoBrambilla opened 5 years ago

DiegoBrambilla commented 5 years ago

Hi, a very helpful feature to add would be demultiplexing of the reads as an optional step. This function already exists in QIIME2, so it should be possible to add it to the rrna-ampliseq pipeline.

d4straub commented 5 years ago

This might be a helpful feature. As far as I know, there is ongoing work to wrap DADA2 directly in this pipeline instead of running DADA2 through QIIME2. I am therefore unsure how to integrate this feature sustainably, given the major changes planned for the early workflow. However, PRs are welcome.

d4straub commented 4 years ago

@DiegoBrambilla is planning to implement dada2 for PacBio analysis and could immediately add that demultiplexing step :)

DiegoBrambilla commented 4 years ago

We will take it into consideration. For the time being, implementing the R-DADA2 pipeline, taxonomy annotation from several sources, and handling PacBio reads take priority.

d4straub commented 2 years ago

Demultiplexing could be done via cutadapt as documented here. I have never come across the need for demultiplexing in the pipeline, but if anyone does, please mention it here and I might look into it further.

a4000 commented 12 months ago

I want to add demultiplexing (with Cutadapt) to Ampliseq. The way I've handled demultiplexing in my own nf-core style pipeline is to ask the user to specify the path to their raw data on the command line: --raw_data "/path/to/data/*{R1,R2}*.fastq.gz". Then, in the sample sheet, the user has to add the columns fw_index, rv_index, fw_primer, and rv_primer (the two rv columns can be empty for single-end data). I use the _index columns for demultiplexing and the _primer columns for trimming after demultiplexing. The main issue I see is that Ampliseq doesn't require a sample sheet as input, so I'm wondering if anyone has a suggestion for a better way of adding this feature to Ampliseq. Maybe the sample sheet should be required if the user wants to demultiplex?
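(Illustrative sketch, not the actual pipeline code: one way the sample sheet layout described above could be read in Python. The sampleID column name, the tab-separated format, and all sequences are assumptions made up for this example; only the fw_index, rv_index, fw_primer, and rv_primer column names come from the comment.)

```python
import csv
import io

# Hypothetical sheet: the sampleID column and all sequences are invented.
# The second row leaves both rv columns empty, i.e. single-end data.
SHEET = (
    "sampleID\tfw_index\trv_index\tfw_primer\trv_primer\n"
    "s1\tACGTAC\tTTGACA\tGTGYCAGCMGCCGCGGTAA\tGGACTACNVGGGTWTCTAAT\n"
    "s2\tCATGGA\t\tGTGYCAGCMGCCGCGGTAA\t\n"
)


def parse_sheet(text):
    """Split each row into indexes (demultiplexing) and primers (trimming)."""
    samples = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        rv_index = row["rv_index"].strip()
        rv_primer = row["rv_primer"].strip()
        samples.append({
            "id": row["sampleID"],
            # Used first, to assign reads to samples.
            "indexes": (row["fw_index"].strip(), rv_index or None),
            # Used afterwards, to trim primers from the demultiplexed reads.
            "primers": (row["fw_primer"].strip(), rv_primer or None),
            # Empty rv columns are treated as single-end.
            "paired": bool(rv_index),
        })
    return samples


samples = parse_sheet(SHEET)
for s in samples:
    print(s["id"], "paired-end" if s["paired"] else "single-end", s["indexes"])
```

Keeping indexes and primers as separate fields mirrors the two-stage order described above: demultiplex on the _index columns, then trim on the _primer columns.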

d4straub commented 12 months ago

What about adding a few optional columns (such as fw_index, rv_index) to the sample sheet? If those columns are present, demultiplexing will run. If that messes too much with existing routines, a separate input file (e.g. --demultiplex "sheet.tsv") that contains the necessary information (samplesheet & demultiplexsheet have identical IDs) might be an option. While ampliseq does not require a samplesheet (folder input & fasta input are also allowed), requiring one for demultiplexing would be fine. After all, a samplesheet can handle more info than a folder input. Not all input options need to support all functionality, imho.
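(Illustrative sketch of the two options above, in Python rather than the pipeline's own code. The function names are hypothetical, and treating fw_index as the column that triggers demultiplexing is an assumption; the comment only says "if those columns are present".)

```python
def wants_demultiplexing(header):
    """Option 1: run demultiplexing only if the optional index column exists.

    Assumption: fw_index alone is enough to trigger it, since rv_index
    would be absent or empty for single-end data.
    """
    return "fw_index" in header


def check_matching_ids(samplesheet_ids, demultiplex_ids):
    """Option 2: a separate demultiplex sheet must list identical sample IDs."""
    mismatched = set(samplesheet_ids) ^ set(demultiplex_ids)
    if mismatched:
        raise ValueError(f"IDs not present in both sheets: {sorted(mismatched)}")


# A sheet without index columns keeps today's behaviour (no demultiplexing).
print(wants_demultiplexing(["sampleID", "forwardReads", "reverseReads"]))
print(wants_demultiplexing(["sampleID", "fw_index", "rv_index"]))
check_matching_ids(["s1", "s2"], ["s2", "s1"])  # passes silently
```

With option 1, existing sample sheets stay valid unchanged, which is the main argument for optional columns over a second input file.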

erikrikarddaniel commented 12 months ago

To me, adding columns to the sample sheet sounds best.

NoMeatNo commented 2 months ago

Hi there,

I’m curious if it’s now possible to utilize AmpliSeq with the combinatorial dual indexing system and perform demultiplexing directly in the pipeline as part of the AmpliSeq workflow. Could someone please clarify? Thanks!

a4000 commented 2 months ago

@NoMeatNo unfortunately no. That's not a part of Ampliseq yet.

NoMeatNo commented 2 months ago

Oh, I see. Thanks @a4000 for the quick response.

In the meantime, what’s the best strategy to follow? Would using Cutadapt and then Ampliseq be effective? How about q2-demux?

Earlier, you mentioned a method for demultiplexing in your own nf-core style pipeline, which involved specifying the path to raw data and using specific columns in the sample sheet. Could you provide more details on how you managed it? I’d appreciate any additional information you can share.

a4000 commented 2 months ago

@NoMeatNo I haven't tried q2-demux, but I do recommend following Cutadapt's documentation on demultiplexing here. Using Cutadapt then Ampliseq should be effective.
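(Illustrative sketch of the Cutadapt-then-Ampliseq route for combinatorial dual indexes. It only assembles a command line patterned on the demultiplexing section of Cutadapt's documentation and does not run it. All file names and the output directory are hypothetical, and the error rate of 0.15 is an example value to adjust for your index lengths.)

```python
import shlex


def cutadapt_demux_cmd(fwd_barcodes, rev_barcodes, reads_1, reads_2, outdir="demux"):
    """Build a Cutadapt call for paired-end, combinatorial dual-index demultiplexing.

    fwd_barcodes / rev_barcodes are FASTA files of index sequences
    (hypothetical names). Cutadapt replaces {name1} and {name2} in the output
    paths with the names of the matching forward and reverse barcodes.
    """
    return [
        "cutadapt",
        "-e", "0.15", "--no-indels",        # allow mismatches but no indels
        "-g", f"^file:{fwd_barcodes}",      # anchored 5' barcodes on R1
        "-G", f"^file:{rev_barcodes}",      # anchored 5' barcodes on R2
        "-o", f"{outdir}/{{name1}}-{{name2}}.1.fastq.gz",
        "-p", f"{outdir}/{{name1}}-{{name2}}.2.fastq.gz",
        reads_1, reads_2,
    ]


cmd = cutadapt_demux_cmd(
    "barcodes_fwd.fasta", "barcodes_rev.fasta",
    "pool_R1.fastq.gz", "pool_R2.fastq.gz",
)
# Print a copy-pasteable shell command; the output directory must exist first.
print(shlex.join(cmd))
```

The command removes only the barcodes, so the resulting per-sample FASTQ files can then be given to Ampliseq as regular input.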

d4straub commented 2 months ago

You could also check out https://nf-co.re/demultiplex (which I have never used), apply it first, and then use ampliseq. If you do, let us know if that works as expected. Just don't do primer trimming or any quality filtering beforehand!