chasewnelson / SNPGenie

Program for estimating πN/πS, dN/dS, and other diversity measures from next-generation sequencing data
GNU General Public License v3.0

Using SNPGenie on VCF from RAD-seq #68

Open AudeCaizergues opened 1 year ago

AudeCaizergues commented 1 year ago

Dear developer,

I'm interested in estimating piN/piS for 6 populations of individuals. The individuals were sequenced separately with RAD-seq, the reads were mapped to the reference genome, and variant calling was then done with Stacks, which gives me a VCF (format 1). I read the part of the manual explaining how to use SNPGenie with the VCF format, and I'm still not quite sure I understand how to use it. First, can SNPGenie be run on several individuals at the same time? Or should I run it on one individual at a time and then average the piN/piS per population? Second, I see that even with the VCF format I need to provide a FASTA file, but I'm not sure how to obtain it... I only have the FASTA files of the reads for each individual. Could you please explain what type of FASTA I need? Thank you,

Aude

singing-scientist commented 1 year ago

Dear Aude, apologies that I missed this. Unfortunately, I've never known anyone to use SNPGenie for RAD-seq data, but in principle it should be possible if you have a VCF file. You would indeed need a reference FASTA; the site numbers in the VCF would need to correspond to the coordinates of that FASTA; and you'd need a GTF specifying any genes therein. Depending on your starting data, this may require quite a bit of manual/scripting preparation. Let me know if this helps.

Chase
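
The requirement above that VCF site numbers correspond to the coordinates of the reference FASTA can be checked before running anything. The following is only a minimal sketch and not part of SNPGenie: it assumes a standard tab-delimited VCF (CHROM, POS, ID, REF, ALT, ... with 1-based POS) and a plain-text FASTA, and the function names `read_fasta` and `find_ref_mismatches` are hypothetical helpers introduced for illustration. It reports every record whose REF allele does not match the FASTA at the stated position, which usually means the VCF was called against a different sequence or coordinate system.

```python
from typing import Dict, Iterable, List, Tuple


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {sequence_name: uppercase_sequence}.

    The sequence name is the first whitespace-delimited token after '>'.
    """
    chunks: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {key: "".join(parts) for key, parts in chunks.items()}


def find_ref_mismatches(
    vcf_lines: Iterable[str], reference: Dict[str, str]
) -> List[Tuple[str, int, str, str]]:
    """Return (chrom, pos, vcf_ref, fasta_bases) for each inconsistent record."""
    mismatches = []
    for line in vcf_lines:
        # Skip meta-information, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3].upper()
        seq = reference.get(chrom)
        if seq is None:
            # The VCF names a sequence that the FASTA does not contain.
            mismatches.append((chrom, pos, ref, "<missing contig>"))
            continue
        # VCF positions are 1-based; Python string indices are 0-based.
        found = seq[pos - 1 : pos - 1 + len(ref)]
        if found != ref:
            mismatches.append((chrom, pos, ref, found))
    return mismatches


if __name__ == "__main__":
    # Hypothetical file names; substitute your own reference and VCF.
    with open("reference.fasta") as fasta_handle:
        reference = read_fasta(fasta_handle)
    with open("populations.vcf") as vcf_handle:
        for record in find_ref_mismatches(vcf_handle, reference):
            print(*record, sep="\t")
```

If this prints nothing, the VCF and FASTA at least agree on the reference alleles. It does not check that the GTF uses the same coordinates, which would need a separate look.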