alexharkess / PBIO_6550

7 stars 1 forks source link

how to calculate the TajimaD using vcftools?could you share your code?Any help will be greatly appreciated,thanks. #1

Open vitty0513 opened 4 years ago

Lilili2017 commented 4 years ago

I don't know how to use vcftools to estimate Tajima's D directly. I will show you how I did. 1: estimate nucleotide diversity for each snp: vcftools --vcf - --site-pi --positions SNP_list.txt --out output Pi of the gene: Then just sum pi across loci. 2: estimate Tajimas' D for each gene in R: a1_fun<- sum(1/c(1:(n-1))) ##n is the samples size of the population theta=S/a1 ####S is the segregating sites which you can get from vcftools b1=(n+1)/(3(n-1)) b2=2(n^2 + n + 3)/(9n(n-1)) c1 = b1-1/a1 c2 = b2 - (n+2)/(a1n) + a2/a1^2 e1 = c1/a1 e2 = c2/(a1^2+a2) D=D<-(pi-S/(a1L)/(e1S+e2S*(S-1))^0.5 ###L is the length of the gene.

Hope it can help you.