terrimporter / MetaWorks

MetaWorks is a flexible multi-marker metabarcode pipeline for processing paired-end Illumina reads from raw fastq.gz files to taxonomic assignments.
https://terrimporter.github.io/MetaWorksSite/
GNU General Public License v3.0

Can MetaWorks accommodate merged Illumina reads? #13

Closed. Jess-Schultz closed this issue 11 months ago

Jess-Schultz commented 1 year ago

Hi there,

I have reads that are demultiplexed but the R1 and R2 reads have already been merged. Our script merges them before demultiplexing because the primers are dual indexed and both barcodes are required to assign the sequence to the correct sample.

Is it okay to run merged demultiplexed reads through the single read ESV workflow? Or is there a better way to process this type of data? Will the presence of both the forward and reverse primers on the sequences mess up any of the downstream steps?

Thank you kindly for your help.

terrimporter commented 1 year ago

MetaWorks expects to start with the raw paired-end reads, so I would ask for access to the original raw files. MetaWorks has a 'barcoding pipeline' that you can use, and you can provide the tag+primer sequence for the 5' and 3' ends for each of your samples. There is a sample file in the /testing directory. You absolutely have to trim away both the forward and reverse primers from the sequences, because primer binding isn't exact: in those regions the resulting sequences reflect the primers you used, not necessarily the original template.
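To illustrate the trimming step described above, here is a minimal Python sketch that strips an anchored forward primer from the 5' end of a merged read and the reverse complement of the reverse primer from the 3' end. It is not how MetaWorks does it; the function names and the toy primers and read are hypothetical, and unlike a real primer trimmer it only accepts exact matches (allowing for IUPAC degenerate bases) with no mismatches.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate bases.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a primer with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in primer.upper())


def trim_primers(read, fwd_primer, rev_primer):
    """Remove both primers from a merged read.

    The forward primer must sit at the very start of the read and the
    reverse complement of the reverse primer at the very end.
    Returns the sequence between them, or None if either primer is missing.
    """
    pattern = re.compile(
        "^" + primer_regex(fwd_primer)
        + "(.*)"
        + primer_regex(reverse_complement(rev_primer)) + "$"
    )
    match = pattern.match(read.upper())
    return match.group(1) if match else None


# Toy example (hypothetical primers and read).
trimmed = trim_primers("ACGTAGGGCCCTCGAA", "ACGTR", "TTYGA")
untrimmed = trim_primers("GGGCCC", "ACGTR", "TTYGA")
```

A read that lacks either primer returns None here, which mirrors the usual choice of discarding reads where the primers cannot be found rather than keeping them untrimmed.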

Jess-Schultz commented 1 year ago

Ok great. I will give that a try. Thanks so much for your help!

Jess-Schultz commented 1 year ago

Hello. Sorry to trouble you. I have a few follow up questions.

1) I am getting this error:

KeyError in file /home/innovation-admin/MetaWorks1.13.0/snakefile_dualIndexedSamples, line 136: 'CUTADAPT'
File "/home/innovation-admin/MetaWorks1.13.0/snakefile_dualIndexedSamples", line 136, in

Does this mean CUTADAPT needs to be defined somewhere in the script before line 136?

2) In the process of trying to troubleshoot, I noticed this line at the top of the snakefile_dualIndexedSamples: "This pipeline processes dual-indexed individual samples, not bulk samples." My samples are indeed bulk samples. They are eDNA samples from marine sediments. Does this mean I should not use this pipeline? I didn't see a script or config file called 'barcoding,' so I assumed it was the dual indexed pipeline.

3) I am wondering if an alternative might be to run the ESV singleRead pipeline on the demultiplexed merged reads using the forward primers, but with the reverse primers trimmed off ahead of time, for example by trimming a fixed number of bps from that end.
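On the KeyError in question 1, a minimal sketch of how such an error typically arises: Snakemake exposes the loaded config file to the snakefile as a Python dict named `config`, so looking up a key that the config file does not define raises `KeyError`. Whether line 136 of the snakefile actually reads `config["CUTADAPT"]` is an assumption; the dict contents and the `get_setting` helper below are hypothetical, and only the key name `CUTADAPT` comes from the error message.

```python
def get_setting(config, section):
    """Return config[section], or raise a KeyError naming the missing section."""
    try:
        return config[section]
    except KeyError:
        raise KeyError(f"config file has no '{section}' section") from None


# Hypothetical stand-in for the dict built from a config file
# that does not define a CUTADAPT section.
config = {"OTHER_SECTION": {}}

try:
    get_setting(config, "CUTADAPT")
    message = ""
except KeyError as err:
    message = str(err)
```

If this is the cause, the fix would be in the config file passed to the workflow rather than in the snakefile itself.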

Thanks for your help.

terrimporter commented 1 year ago

The barcoding workflow is meant for individual samples and the ESV workflow is meant for bulk samples. In your case, I'd request the raw demultiplexed reads from the sequencing centre. Usually they provide demultiplexed reads in which the Illumina adapters have been removed and the sequences have been assigned to samples, but the primers are still attached. You'd get a set of R1 (forward) and R2 (reverse) reads in fastq files. This is what the pipeline is meant to start with. Then you can just edit the adapters_anchored.fasta file so that your primers are removed.
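As a rough sketch of what editing the adapters_anchored.fasta file could involve, the Python below builds one FASTA record from a forward and a reverse primer. It assumes the file holds cutadapt-style linked adapters (forward primer anchored with `^`, then `...`, then the reverse complement of the reverse primer anchored with `$`); check the sample file shipped with MetaWorks before relying on that layout. The marker name, primers, and helper function are hypothetical.

```python
# Complement table that also handles IUPAC degenerate bases.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def linked_adapter_record(name, fwd_primer, rev_primer):
    """Format one FASTA record as an anchored, linked adapter.

    The reverse primer is reverse-complemented because, in a merged
    read, it appears at the 3' end in that orientation.
    """
    rev_rc = rev_primer.upper().translate(COMPLEMENT)[::-1]
    return f">{name}\n^{fwd_primer.upper()}...{rev_rc}$\n"


# Toy example (hypothetical marker name and primers).
record = linked_adapter_record("myMarker", "ACGTR", "TTYGA")
```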

Jess-Schultz commented 1 year ago

Ok thank you.
