Closed by koopkaup 5 years ago
You should be able to use a linked adapter for this purpose. Can you try this and let me know whether it works? I’ll then make it clearer in the documentation.
I will try this, but how can I provide linked adapters in a FASTA file, as in the example -a file:barcodes.fasta?
Put them in the FASTA file like this:
>adapter1
ACGT...AACCGGTT
>adapter2
TTAAGG...CCAA
I guess that, since you use file:, you probably have more than a few adapters. In that case, putting them into the FASTA file as I said in the comment above requires you to list all possible combinations. Depending on how many there are, this could be a bit inefficient, and you may be better off running cutadapt in a "nested" way: run it once to demultiplex according to the forward barcode, and then run it on all the output files to demultiplex according to the reverse barcode.
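Listing all combinations by hand gets tedious quickly, so the FASTA file could be generated instead. The following is only a minimal sketch: the barcode names and sequences are made up, the `linked_adapter_fasta` helper is hypothetical, and the `F1-R1` naming scheme is an assumption. The reverse sequences are assumed to be written the way they appear in the read.

```python
from itertools import product


def linked_adapter_fasta(forward, reverse):
    """Return FASTA text with one linked adapter (FWD...REV) per barcode pair."""
    records = []
    for (fwd_name, fwd_seq), (rev_name, rev_seq) in product(
        forward.items(), reverse.items()
    ):
        # "..." between the two sequences marks a linked adapter.
        records.append(f">{fwd_name}-{rev_name}\n{fwd_seq}...{rev_seq}")
    return "\n".join(records) + "\n"


# Hypothetical barcodes, for illustration only.
forward = {"F1": "ACGT", "F2": "TTAAGG"}
reverse = {"R1": "AACCGGTT", "R2": "CCAA"}

fasta_text = linked_adapter_fasta(forward, reverse)
print(fasta_text, end="")
```

With n forward and m reverse barcodes this writes n × m records, which is why the nested approach may be preferable for large barcode sets.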
Sorry, I should probably have read the title more thoroughly. I overlooked the fact that you have paired-end reads, so the above will only work as I suggested if you merge the reads before running cutadapt.
I will consider adding an option to make this easier.
Can you clarify whether your reads look like you describe above or whether it is the DNA fragment that you were describing? (forward_barcode_sequence forward_primer_sequence read reverse_primer_sequence reverse_barcode)
My reads are in that format.
Hi, it seems I have the same kind of data. Please check my recent post on Biostars: https://www.biostars.org/p/324429/#324738
Could one of you send me some small part of your dataset (with just a couple of reads)? I would also need to know what the forward_barcode_sequence, forward_primer_sequence, reverse_primer_sequence, reverse_barcode sequences are (these aren’t random barcodes I assume). You can send this to me privately at marcel.martin@scilifelab.se, but mention it here if you have done so.
However, note that I will not have time to work on this until the middle of August at the earliest.
Hi, unfortunately my data were already demultiplexed by the sequencing facility. But to my understanding the task is exactly the same (see the detailed description on Biostars). In case you still need my data, please let me know. This may be useful for understanding dual index technology: https://www.drive5.com/usearch/manual/pipe_demux.html
As a summary for myself: There are two different dual indexing strategies used by Illumina, combinatorial dual indexing and unique dual indexing (UDI).
For dealing with this type of data in Cutadapt, we need two options.
For combinatorial indexing, an idea could be to allow not only {name} in the demultiplexing file name template, but perhaps something like {name1}{name2}, where {name1} is the name of the adapter (barcode) that was found in R1 and {name2} is the name of the adapter (barcode) that was found in R2.
For UDI, the --pair-adapters option suggested in #347 would be necessary.
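To make the difference between the two cases concrete, here is a rough sketch in plain Python. It is not how Cutadapt is implemented: the function names, the barcode names and the hyphen in the template are all assumptions for illustration. Combinatorial indexing allows any R1 barcode together with any R2 barcode, whereas UDI only allows the i-th R1 barcode together with the i-th R2 barcode.

```python
from itertools import product


def expand_template(template, name1, name2):
    """Fill {name1} (barcode found in R1) and {name2} (barcode found in R2)."""
    return template.format(name1=name1, name2=name2)


def combinatorial_outputs(template, r1_names, r2_names):
    """Combinatorial indexing: every R1 barcode may occur with every R2 barcode."""
    return [expand_template(template, a, b) for a, b in product(r1_names, r2_names)]


def paired_outputs(template, r1_names, r2_names):
    """UDI: only the i-th R1 barcode with the i-th R2 barcode is valid."""
    return [expand_template(template, a, b) for a, b in zip(r1_names, r2_names)]


# Hypothetical barcode names and file name template.
template = "{name1}-{name2}.1.fastq.gz"
r1_names = ["F1", "F2"]
r2_names = ["R1", "R2"]

print(combinatorial_outputs(template, r1_names, r2_names))
print(paired_outputs(template, r1_names, r2_names))
```

In this toy case the combinatorial variant yields four output names and the paired variant two.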
Have you come up with a solution for the first situation where there are multiple combinations of indices? We have done sequencing like this and I would like to try out your method.
My plan is to implement the idea where you can use something like {name1}{name2} in the file name templates, as mentioned above. I don’t know when I will have time for this, hopefully this month.
For completeness: The second part, which is the --pair-adapters option, is implemented.
Any update regarding combinatorial indexing?
I’m working on this now, give me a couple more days.
Hi, this is now implemented ("combinatorial demultiplexing"). Please read the new section in the documentation.
It would be great if someone could test it and let me know whether this is what you need before I make a new release. Just follow the instructions for how to install a development version.
Thanks! I can try it next week and let you know how it worked.
Hi, I have Illumina MiSeq reads in the following format: forward_barcode_sequence forward_primer_sequence read reverse_primer_sequence reverse_barcode
Because forward barcodes are used in combination with reverse barcodes, searching for only the first barcode, as cutadapt does right now, does not work. Is it possible to add an option to search for both barcodes simultaneously to decide where a read belongs?