dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Optimal min length of reads after adapter trim for paired-end ddRAD data. #519

Closed AliBasuony2022 closed 1 year ago

AliBasuony2022 commented 1 year ago

Dear Isaac,

I have a paired-end ddRAD data set (100 bp for each read), and I'm confused about the optimal value for parameter 17, "Min length of reads after adapter trim". I did the demultiplexing step (step 1 in the ipyrad pipeline), including adapter removal, using another pipeline and started from step 2 (I just uploaded the fastq files).

I have tried both 180 and 90 as the minimum length of reads after adapter trim. With 180 I got loci whose lengths varied relatively little, whereas with 90 the locus lengths varied a lot. Should I stick with the default of 35?
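To put numbers on "low" versus "high" length variation, the locus lengths in each run's .loci file could be summarized with a short script. The following is a minimal sketch, not part of ipyrad. It assumes the usual .loci layout in which each sequence line is a sample name followed by whitespace and the aligned sequence, and each locus ends with a line starting with `//`. The function name `locus_lengths` and the example text are hypothetical. Note that any spacer between read pairs in a paired-end locus would be counted as part of the length.

```python
import statistics


def locus_lengths(loci_text):
    """Return the alignment length of each locus in .loci-formatted text.

    Assumes (not verified against every ipyrad version) that sequence lines
    look like "<sample> <aligned sequence>" and that a line starting with
    "//" closes each locus.
    """
    lengths = []
    current = None  # length of the locus currently being read
    for line in loci_text.splitlines():
        if not line.strip():
            continue
        if line.startswith("//"):
            # End of a locus: record its length if we saw any sequence.
            if current is not None:
                lengths.append(current)
            current = None
        else:
            fields = line.split()
            # The first sequence line of a locus sets the alignment length.
            if len(fields) >= 2 and current is None:
                current = len(fields[-1])
    return lengths


# Hypothetical two-locus example, for illustration only.
example = """\
sampleA     ACGTACGTAC
sampleB     ACGTACGTAC
//                  *   |0|
sampleA     ACGTAC
sampleB     ACGTAT
//               *|1|
"""

lengths = locus_lengths(example)
print("locus lengths:", lengths)
print("mean:", statistics.mean(lengths))
print("population std dev:", statistics.pstdev(lengths))
```

Running this on the .loci output of each run (read with `open(path).read()`) would give a mean and spread that can be compared directly between the 180, 90, and 35 settings.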

I'm only using the <.loci> output to run an R script that tests for cross-contamination between two closely related species.

My understanding is that the longer the loci, the more robust the result: short loci could be conserved between the species and might give a false signal of cross-contamination.

Any advice would be highly appreciated.

Kind regards, Ali

isaacovercast commented 1 year ago

Hi Ali,

The filter_min_trim_len parameter is for removing very short reads after trimming which might not have much good information in them. If you set this value too high you will end up removing reads which contain useful information.
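As a rough illustration of that trade-off, here is a minimal sketch of a minimum-length filter applied to trimmed reads. It is a toy model, not ipyrad's implementation: it assumes the threshold is compared against each read's own post-trim length, and the function name `filter_by_min_length` and the read lengths are made up for the example.

```python
def filter_by_min_length(trimmed_lengths, min_len):
    """Keep only reads whose post-trim length is at least min_len."""
    return [n for n in trimmed_lengths if n >= min_len]


# Hypothetical post-trim lengths (bp) for reads that started at 100 bp.
trimmed = [100, 100, 97, 92, 88, 75, 60, 41, 30, 12]

# Raising the threshold discards progressively more reads, including
# moderately trimmed ones that may still carry useful sequence.
for min_len in (35, 60, 90):
    kept = filter_by_min_length(trimmed, min_len)
    print(f"min_len={min_len}: kept {len(kept)} of {len(trimmed)} reads")
```

In this toy data the default-like threshold of 35 removes only the two shortest reads, while a threshold of 90 also removes reads of 60 to 88 bp.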

I'm not sure how you're establishing whether a read is contamination or not, but I don't think read length is a good indicator of contamination either way. Then again, I don't know exactly what you're doing, so I may be wrong.

Since this is more of a question than an ipyrad issue, I'm going to close this. In the future it might be better to post questions to our Gitter channel, if that works for you: https://app.gitter.im/#/room/#dereneaton_ipyrad:gitter.im

all the best, -isaac

AliBasuony2022 commented 1 year ago

Thanks Isaac!