dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

min_sample_locus and popfile #379

Closed giorgio-92 closed 4 years ago

giorgio-92 commented 4 years ago

Hello everyone, I'm working on a dataset of 55 samples. I set min_sample_locus (params n=21) to 44 and the program gives me good results (about 70 loci). But I need to work in treemix, so I created a pop_assign_file, setting a different min_sample_locus value for each population in the last row (about 3/4 of the samples in each population must have the locus). In this case the program gives me bad results, with a lower number of loci. I'd like to get the same results as in the first analysis. How can I do that?

isaacovercast commented 4 years ago

Hello,

This is not exactly an issue with ipyrad but more of a question about parameter settings, so it is better suited to the gitter channel. I will answer your question here, but if you have follow-up questions, please post them to the gitter channel.

1) It usually helps to post your params file and, if you are using one, your pop assign file. It also helps to show results from the step 7 outfiles directory.
2) 70 loci is actually very, very few. I always counsel against setting very high values for the min_sample_locus parameter.
3) If you set a min_sample_locus for each population in the pops file, then of course you will always return fewer loci: you are applying a more restrictive filter. If you want to recover more loci, reduce the min_sample_locus cutoff for each population. It's that simple.
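To illustrate point 3, here is a minimal toy sketch in Python of why a per-population minimum can be stricter than a single overall minimum. This is not ipyrad's code: the sample names, population names, loci, thresholds, and the helper functions `passes_global` and `passes_per_pop` are all made up for the illustration.

```python
from typing import Dict, List, Set

# Hypothetical populations and their samples (illustration only).
pops: Dict[str, List[str]] = {
    "popA": ["a1", "a2", "a3", "a4"],
    "popB": ["b1", "b2", "b3", "b4"],
}

# Hypothetical per-population minimums (about 3/4 of each population).
min_per_pop: Dict[str, int] = {"popA": 3, "popB": 3}

# Hypothetical loci, each mapped to the set of samples that have data for it.
loci: Dict[str, Set[str]] = {
    "locus1": {"a1", "a2", "a3", "a4", "b1", "b2"},  # 6 samples, only 2 from popB
    "locus2": {"a1", "a2", "a3", "b1", "b2", "b3"},  # 6 samples, 3 from each pop
    "locus3": {"a1", "a2", "b1"},                    # 3 samples in total
}


def passes_global(present: Set[str], min_samples: int) -> bool:
    """Keep a locus if enough samples overall have data for it."""
    return len(present) >= min_samples


def passes_per_pop(present: Set[str]) -> bool:
    """Keep a locus only if every population meets its own minimum."""
    return all(
        len(present & set(members)) >= min_per_pop[pop]
        for pop, members in pops.items()
    )


# One overall cutoff of 6 samples (3/4 of the 8 toy samples).
global_kept = [name for name, present in loci.items() if passes_global(present, 6)]

# Separate cutoffs per population.
pop_kept = [name for name, present in loci.items() if passes_per_pop(present)]

print("overall cutoff keeps:", global_kept)
print("per-population cutoff keeps:", pop_kept)
```

In this toy data, locus1 has enough samples overall but too few from popB, so it passes the overall cutoff and fails the per-population one. Lowering the per-population minimums is what lets loci like that back in.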

Hope that helps.

giorgio-92 commented 4 years ago

Sorry, I'll write on gitter! I've already tried setting the min_sample_locus cutoff for each population to the minimum (n=1), but it still recovers few loci (only 32). Thanks