chasewnelson / SNPGenie

Program for estimating πN/πS, dN/dS, and other diversity measures from next-generation sequencing data
GNU General Public License v3.0

Variant input format of VCF #4

HamletShaoE closed 8 years ago

HamletShaoE commented 9 years ago

Hi, I notice that SNPGenie takes CLC or Geneious format as input. Could you also take VCF format into consideration, since a lot of good human and fly data are available in this format? Thanks.

singing-scientist commented 9 years ago

Hello, HamletShaoE! As suggested by our recent paper in Bioinformatics, we are in the process of incorporating VCF, and hope to have this accomplished in about a week's time. Thanks for your feedback, and we hope to have this addressed very soon! Best wishes.

singing-scientist commented 8 years ago

There are essentially two kinds of VCF files we are trying to incorporate. The first is a VCF file which summarizes multiple individual sequencing experiments. In such a case, in the "INFO" column, there is typically an "NS=" (number of samples) and an "AF=" (allele frequency) entry. On the other hand, we also have people who have performed single-run pooled-sequencing experiments, in which case there's usually a "DP4=" entry in the "INFO" column, containing the numbers of forward and reverse reads for the reference allele and for the alternate allele. Do your data fit one of these situations? If not, I'll want to make sure that your situation is included.
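To check which of the two situations a given file falls into, something like the following rough Python sketch could be run on the "INFO" column of a record. It is not SNPGenie's own code; the function names (`parse_info`, `classify_info`, `alt_frequency_from_dp4`) and the category labels are made up for illustration, and it assumes "DP4" lists its four counts in the order ref-forward, ref-reverse, alt-forward, alt-reverse, as samtools/bcftools write it.

```python
def parse_info(info):
    """Split a VCF INFO string such as 'NS=3;AF=0.5' into a dict."""
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        # Flag-style entries (no '=') are stored as True.
        fields[key] = value if sep else True
    return fields


def classify_info(info):
    """Guess which kind of VCF record this is (labels are hypothetical)."""
    fields = parse_info(info)
    if "DP4" in fields:
        return "pooled"        # single-run pooled sequencing, read counts
    if "NS" in fields and "AF" in fields:
        return "multi-sample"  # summary of several individual experiments
    return "unknown"


def alt_frequency_from_dp4(info):
    """Alternate-allele frequency from DP4 counts.

    Assumes the order ref-forward, ref-reverse, alt-forward, alt-reverse.
    Returns None when there are no reads at all.
    """
    counts = [int(x) for x in parse_info(info)["DP4"].split(",")]
    ref_fwd, ref_rev, alt_fwd, alt_rev = counts
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    if total == 0:
        return None
    return (alt_fwd + alt_rev) / total


if __name__ == "__main__":
    print(classify_info("NS=3;DP=14;AF=0.5"))
    print(classify_info("DP=20;DP4=10,5,3,2"))
    print(alt_frequency_from_dp4("DP=20;DP4=10,5,3,2"))
```

A record that comes back as "unknown" would be one that fits neither situation described above.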

singing-scientist commented 8 years ago

The latest SNPGenie incorporates two versions of VCF. Please let us know if your format differs from these by opening a new Issue.