atcg / clustOpt

Scripts for the optimization of RADseq clustering thresholds in population genetics
MIT License

f1 and f2 loci #2

Open k-sanchez opened 5 years ago

k-sanchez commented 5 years ago

Hello, I am trying to follow the pipeline to find the optimal clustering threshold. I am using ipyrad 0.7.30, and the step 5 output (s5.consens.txt) has neither an f1loci nor an f2loci column. Which column is equivalent to the f1loci/f2loci columns in the step 5 output that you obtained? s5_consens_stats.txt

k-sanchez commented 5 years ago

Sorry, I didn't clarify that I'm running the first step of the protocol: Fraction of inferred paralogues.

atcg commented 5 years ago

Thanks very much for the good question. I would think "filtered_by_maxH" would be the closest in meaning.

mergi-2674 commented 3 years ago

Hi, I have the same question as k-sanchez. I don't understand your answer, atcg. Which does "filtered_by_maxH" stand for: f1loci or f2loci? Could you clarify, please? Thank you!

VARR96 commented 1 year ago

Hello, I am trying to solve the same problem. I understand that "filtered_by_maxH" is the same as computing f1loci - f2loci, right? However, to get the percentage of loci marked as paralogs, we need to compute (f1loci - f2loci) / f1loci. The problem is: when you use ipyrad, where can we get that data from?

Thank you

Vic
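
For reference, the arithmetic discussed in this thread can be sketched in Python as below. This is only an illustration, not the clustOpt implementation. The function names are hypothetical, the counts in the usage lines are made up, and the idea that ipyrad's "filtered_by_maxH" equals f1loci - f2loci is the reading suggested in the comments above, not something confirmed here. Which ipyrad column supplies the denominator (f1loci) is still the open question, so it is left as a plain argument.

```python
def paralog_fraction(f1loci: int, f2loci: int) -> float:
    """Fraction of loci marked as paralogs: (f1loci - f2loci) / f1loci."""
    if f1loci <= 0:
        raise ValueError("f1loci must be a positive count")
    if not 0 <= f2loci <= f1loci:
        raise ValueError("f2loci must be between 0 and f1loci")
    return (f1loci - f2loci) / f1loci


def paralog_fraction_from_filtered(filtered_by_maxH: int, f1loci: int) -> float:
    """Same metric, starting from the number of loci removed by the filter.

    ASSUMPTION (from this thread, unconfirmed): filtered_by_maxH plays the
    role of f1loci - f2loci. The caller must still supply f1loci, i.e. the
    number of loci present before the paralog filter was applied.
    """
    return paralog_fraction(f1loci, f1loci - filtered_by_maxH)


# Made-up counts, for illustration only.
fraction = paralog_fraction(f1loci=1000, f2loci=950)
same_fraction = paralog_fraction_from_filtered(filtered_by_maxH=50, f1loci=1000)
print(f"{fraction:.1%}", f"{same_fraction:.1%}")
```

Both calls describe the same sample, once with the two pyrad-style counts and once with the filtered count plus the pre-filter total, so they should yield the same fraction.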