Closed sr320 closed 11 months ago
see also https://rpubs.com/sr320/1063532
I used ngsRelate in the epi-gen oyster paper, with genotype likelihoods from ANGSD. Version 2 can take a gzipped VCF file. Based on your VCF file header, it looks like you don't have a PL (genotype likelihood) field, so you will want to use the called genotypes:
./ngsrelate -h my.VCF.gz -T GT -c 1 -O vcf.relatedness
Relatedness depends on the overall allele frequency for each SNP. ngsRelate will calculate that directly from the VCF, but if you have allele frequencies for the SNPs from a larger set of samples, you can provide them with the -f parameter.
And here is R code to get it into a matrix. You might need to set the colnames and rownames of distrab before saving:
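library(spaa)  # provides list2dist()
# Read the pairwise table written by ngsRelate (-O vcf.relatedness)
df <- read.table("vcf.relatedness", header = TRUE)
# Keep the two ID columns and the pairwise relatedness (rab);
# if your output has no ida/idb columns, the a/b index columns identify the pair instead
dfrab <- df[, c("ida", "idb", "rab")]
# Convert the long pair list to a square matrix
distrab <- as.matrix(list2dist(dfrab))
write.table(distrab, file = "MATRIX_mbd_rab.txt", col.names = FALSE, row.names = FALSE, sep = "\t")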
ngsRelate produced a table with the following column titles:
a b nSites J9 J8 J7 J6 J5 J4 J3 J2 J1 rab Fa Fb theta inbred_relatedness_1_2 inbred_relatedness_2_1 fraternity identity zygosity 2of3_IDB FDiff loglh nIter bestoptimll coverage 2dsfs R0 R1 KING 2dsfs_loglike 2dsfsf_niter
While I see rab, "ida" and "idb" are not present (not sure what any of them are 😄).
I think you want a and b; those are the indices of the individuals in the analysis, in the same order as in the VCF. rab is the pairwise relatedness metric. See https://github.com/ANGSD/NgsRelate#output-format
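In case it helps, here is a minimal base-R sketch of building the matrix from the a and b columns instead of ida and idb. It assumes a and b are zero-based indices (as in the README examples), and `rab_matrix`, `toy` and `ids` are hypothetical names made up for illustration, not part of ngsRelate or spaa. The diagonal is left as NA because the table only holds pairs of different individuals.

```r
# Hypothetical helper: turn the pairwise ngsRelate table into a symmetric matrix.
# Assumes columns a, b (zero-based individual indices) and rab are present.
rab_matrix <- function(df, sample_ids = NULL) {
  n <- max(df$a, df$b) + 1                 # number of individuals
  if (is.null(sample_ids)) {
    sample_ids <- as.character(seq_len(n) - 1)  # fall back to the raw indices
  }
  stopifnot(length(sample_ids) == n)
  m <- matrix(NA_real_, nrow = n, ncol = n,
              dimnames = list(sample_ids, sample_ids))
  idx <- cbind(df$a + 1, df$b + 1)         # shift to R's one-based indexing
  m[idx] <- df$rab                         # fill one triangle
  m[idx[, 2:1, drop = FALSE]] <- df$rab    # mirror into the other triangle
  m
}

# Toy input standing in for read.table("vcf.relatedness", header = TRUE)
toy <- data.frame(a = c(0, 0, 1), b = c(1, 2, 2), rab = c(0.25, 0.05, 0.5))
ids <- c("s1", "s2", "s3")  # assumed sample names, in VCF order
rab_mat <- rab_matrix(toy, ids)
print(rab_mat)
```

For the real data, the sample names could come from the VCF header (for example the output of `bcftools query -l my.VCF.gz`), and the result can be saved with write.table as before.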
I am trying to get a genetic relatedness matrix for 26 individuals. I am fairly confident in the merged, filtered VCF file I created, but not clear on how to take it to a genetic relatedness matrix.
Here is some effort in this area: https://d.pr/WyABeO, specifically section 2.1.1.
Any suggestions, advice, or pipelines to follow would be appreciated.