caitiecollins / treeWAS

treeWAS: A Phylogenetic Tree-Based Tool for Genome-Wide Association Studies in Microbes

how to construct a snp matrix #60

Closed ShiminYang97 closed 2 years ago

ShiminYang97 commented 2 years ago

I called SNPs with snippy, which returned a core.vcf file. But I don't know how to construct a SNP matrix from the VCF.

caitiecollins commented 2 years ago

For anyone needing to convert a vcf file (e.g., a core.vcf file, as produced by snippy) into a snps matrix (as needed for input into treeWAS), the following steps should do the trick:

## Install & load the vcfR package (treeWAS supplies keepLastN and get.binary.snps, used below):
install.packages("vcfR", dependencies = TRUE)
library(vcfR)
library(treeWAS)

## Read data from file (REPLACE "core.vcf" with YOUR FILENAME):
vcf <- read.vcfR("core.vcf", verbose = FALSE)

## Convert:
dna <- vcfR2genind(vcf)
snps <- dna@tab
# str(snps)

## Remove redundant binary columns:
suffixes <- keepLastN(colnames(snps), n = 2)
# table(suffixes)
snps <- get.binary.snps(snps, force=TRUE)
# str(snps)
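
To see what "redundant binary columns" means here, a minimal base-R sketch on a toy table may help. It assumes the genind table has one column per allele, named `<locus>.<allele>`, so that for a biallelic site in haploid samples the two columns are exact complements. The sample names, locus names, and the `tab`, `locus`, `keep`, and `snps_toy` objects are all hypothetical. This only illustrates the idea and is not how `get.binary.snps` is implemented in treeWAS.

```r
## Toy allele table: two samples, two biallelic loci (hypothetical values).
## Column names follow the assumed "<locus>.<allele>" pattern.
tab <- matrix(c(1, 0, 1, 0,
                0, 1, 0, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("sampleA", "sampleB"),
                              c("L1.a", "L1.t", "L2.c", "L2.g")))

## The two columns of a biallelic locus carry the same information:
## each is the complement of the other.
all(tab[, "L1.a"] + tab[, "L1.t"] == 1)

## Strip the trailing ".<allele>" to recover the locus of each column.
locus <- sub("\\.[^.]*$", "", colnames(tab))

## Keep only the first column encountered for each locus.
keep <- !duplicated(locus)
snps_toy <- tab[, keep, drop = FALSE]
colnames(snps_toy)
```

The result keeps one column per locus, which is the shape treeWAS expects for its snps input: individuals in rows, binary genetic variables in columns.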