dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Restricting the length of loci in .loci output to specific length #518

Closed. AliBasuony2022 closed this issue 1 year ago

AliBasuony2022 commented 1 year ago

Dear Isaac,

I hope you are in good health.

Is there any way to get loci of a specific length in the <.loci> output from step 7? I tried both de novo and reference-mapping assembly, and in both cases I got loci of variable length. I have paired-end ddRAD data with 100 bp reads, so I expected loci of ~193 bp after trimming the largest barcode, which is 8 bp. I don't know at which step of the pipeline I can specify this parameter.

I have looked at one of the single-end RAD tutorials (https://ipyrad.readthedocs.io/en/master/tutorial_intro_cli.html), which produced <.loci> output with loci of relatively similar length.

Attached are the params file, a few lines of the <.loci> output, and the script.

Kind regards, Ali

Ali_ddRAD_denovo_plate1and2_R1and2_no_zerda.zip

isaacovercast commented 1 year ago

Hello Ali,

Paired-end RAD data can end up with loci of different lengths for lots of reasons, for example when adapter contamination gets trimmed or when R1 and R2 overlap and are merged during step 3. It is completely normal for final loci from PE data to vary in length, for these and several other reasons.

Hope that helps! -isaac
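
For anyone who wants to check how much locus lengths vary in their own assembly, here is a minimal Python sketch that reads <.loci>-style lines and reports one length per locus. It assumes each locus is a block of "name sequence" lines closed by a line starting with `//`. The function name `locus_lengths` is made up for this example and is not part of the ipyrad API.

```python
def locus_lengths(lines):
    """Return one alignment length per locus from .loci-style lines.

    Assumption: each locus is a block of "name  sequence" lines that is
    closed by a separator line starting with "//".
    """
    lengths = []
    longest = 0
    for line in lines:
        if line.startswith("//"):
            # End of a locus block: record its length and reset.
            if longest:
                lengths.append(longest)
            longest = 0
        elif line.strip():
            # The sequence is the last whitespace-separated field.
            # Any spacer between unmerged R1 and R2 is counted as well.
            seq = line.split()[-1]
            longest = max(longest, len(seq))
    return lengths


# Tiny made-up example with two loci of different lengths.
example = [
    "sampleA    ACGTACGTAC",
    "sampleB    ACGTACGTAC",
    "//            *      |0|",
    "sampleA    ACGTAC",
    "sampleC    ACGTAT",
    "//              * |1|",
]
print(locus_lengths(example))  # [10, 6]
```

On a real file, the same function can be given the open file handle directly, e.g. `with open("my_assembly.loci") as fh: lengths = locus_lengths(fh)`, where the file name is a placeholder.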

AliBasuony2022 commented 1 year ago

Thanks so much, Isaac! You are right, that sounds logical.

I'm just using the <.loci> output to run an R script that tests for cross-contamination, and I wanted to make sure it's fine to carry on with loci of different lengths.

Thanks again, Ali