Closed aureliendejode closed 1 year ago
Hello! Sorry for the delayed response. For paired-end data you do not need to demultiplex by hand; ipyrad handles this for you. Rename your forward and reverse raw data files so that the forward file names include the string R1 and the reverse file names include the string R2, then set the parameters in the params file like this:
pairddrad_example_R*_.fastq.gz ## [2] [raw_fastq_path]
pairddrad ## [7] [datatype]: Datatype (see docs): rad, gbs, ddrad, etc.
TGCAG,CGG ## [8] [restriction_overhang]: Restriction overhang (cut1,) or (cut1, cut2)
Of course, you will also need to provide the path to the barcodes file in parameter 3 (barcodes_path) and change the restriction_overhang sequences to the ones you actually used.
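The renaming step could be sketched in Python as follows. This is only an illustration, not part of ipyrad: the function names are hypothetical, and it assumes your raw files currently end in `_F.fastq.gz` and `_R.fastq.gz`, which may not match your naming. The target names follow the `_R1_` / `_R2_` form of the example pattern above.

```python
from pathlib import Path

SUFFIX = ".fastq.gz"


def ipyrad_style_name(filename, forward_tag="_F", reverse_tag="_R"):
    """Return a name containing _R1_ or _R2_, or None if no tag matches.

    forward_tag and reverse_tag are assumptions about the current naming;
    change them to whatever your sequencing facility used.
    """
    if not filename.endswith(SUFFIX):
        return None
    stem = filename[: -len(SUFFIX)]
    if stem.endswith(forward_tag):
        return stem[: -len(forward_tag)] + "_R1_" + SUFFIX
    if stem.endswith(reverse_tag):
        return stem[: -len(reverse_tag)] + "_R2_" + SUFFIX
    return None


def rename_raw_files(directory, dry_run=True):
    """List the planned (old, new) names; rename only when dry_run is False."""
    planned = []
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(Path(directory).iterdir()):
        new_name = ipyrad_style_name(path.name)
        if new_name is None:
            continue  # leave unrelated files untouched
        planned.append((path.name, new_name))
        if not dry_run:
            path.rename(path.with_name(new_name))
    return planned
```

Running `rename_raw_files("path/to/raw")` with the default `dry_run=True` only reports what would change, so you can check the plan before renaming anything.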
The R2 overhang sequence is actually pretty easy to find because it is the first few bases of each read in the R2 file. Here is the example R2 file, which uses CGG. Your R2 data should look similar, though the overhang sequence may differ:
@lane1_locus0_2G_0_0 2:N:0:
CGGGGTTAAGAGGCCAGTTAACTGCAGCGGGATCGCGCACCATAGCGGCCGTGCCTACGAGTCAGATGTCACTTTTCAGACGCTCATGGAAGTGAGTGCA
+
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
@lane1_locus0_2G_0_1 2:N:0:
CGGGGTTAAGAGGCCAGTTAACTGCAGCGGGATCGCGCACCATAGCGGCCGTGCCTACGAGTCAGATGTCACTTTTCAGACGCTCATGGAAGTGAGTGCA
+
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
@lane1_locus0_2G_0_2 2:N:0:
CGGGGTTAAGAGGCCAGGTAACTGCAGCGGGATCGCGCACCATAGCGGCCGTGCCTACGAGTCAGATGTCACTTTTCAGACGCTCATGGAAGTGAGTGCA
+
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
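If you would rather not inspect the file by eye, a small script can tally how the R2 reads begin. This is a minimal sketch using only the Python standard library, not an ipyrad feature; the function name and the `prefix_len`, `max_reads`, and `top` parameters are made up for illustration, and it assumes a standard four-line-per-record FASTQ file.

```python
import gzip
from collections import Counter


def common_read_starts(fastq_path, prefix_len=5, max_reads=10000, top=5):
    """Count the first prefix_len bases of up to max_reads reads.

    Assumes four lines per FASTQ record, with the sequence on the
    second line of each record. Works on plain or gzipped files.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    reads_seen = 0
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 1:
                continue  # skip header, separator, and quality lines
            counts[line.strip()[:prefix_len]] += 1
            reads_seen += 1
            if reads_seen >= max_reads:
                break
    return counts.most_common(top)
```

In real data the bases after the overhang vary between loci, so the overhang should show up as the leading bases shared by the most frequent prefixes. The prefix length is a guess, so try a few values and compare against the enzyme you used.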
Give that a try and let us know how it goes.
Because this is more of a support question than an 'issue' with ipyrad I am going to close this ticket, but I would encourage you to continue asking questions about how to do your assembly on our ipyrad gitter channel, which is much better for support requests like this:
https://app.gitter.im/#/room/#dereneaton_ipyrad:gitter.im
All the best, -isaac
Hello,
We are analyzing paired-end ddRAD data with ipyrad. Is there a tutorial dedicated to this type of data? We have forward (F) and reverse (R) FASTQ files that need to be demultiplexed. We have the barcodes, found the overhang for the forward reads, and managed to demultiplex those, but it is not clear how to proceed with the reverse reads. The overhang for those reads also does not appear to be easy to identify.
Any insights?
Best
Aurélien