thibautjombart / adegenet

adegenet: a R package for the multivariate analysis of genetic markers
165 stars 64 forks source link

SNP weight values #357

Open Tannervanorden opened 8 months ago

Tannervanorden commented 8 months ago

I am working with adegenet and I am wondering about if there is a way to calculate SNP weight values. I am working with covariance matrices produced by ANGSD and PCangsd. ANGSD outputs a covariance matrix without any SNP weights. PCangsd has the ability to output SNP weights. It would be really cool if I could output the SNPs with the highest weights for each discriminant function. The code I am currently running is posted at the bottom.

Thanks so much, Tanner

library(adegenet) library(tibble)

set.seed(123)

Load the covariance matrix. Don't forget to change the file path if you have downloaded the data to your own computer.

cov = as.matrix(read.table("/Users/tannervanorden/Dropbox/Cutthroat_Research/PCA_Graphs/Original_updated_errorProof.covMat"), header = F)

We will also add a column with population assignments

Hendrys <- rep("Hendry's Creek", 42) PineR <- rep('Pine & Ridge Creeks', 29) Mill <- rep("Mill Creek", 18) Trout = rep("Trout Creek", 4) Mountain_Dell = rep("Mountain Dell", 2) Birch = rep("Birch Creek", 3) Willard = rep("Willard Creek", 30)

population_labels <- c(Hendrys, PineR, Mill, Trout, Mountain_Dell, Birch, Willard)

Perform PCA

mme.pca <- eigen(cov)

Extract eigenvectors and combine with population assignments

eigenvectors = mme.pca$vectors pca_data <- as_tibble(cbind(population_labels, data.frame(eigenvectors)))

dapc_pops <- dapc(pca_data[, -1], grp = population_labels, perc.pca = 100, n.da = 4)

a_score_species <- optim.a.score(dapc_pops, k = 7) ##Optimum number of principle components for 4 discriminant functions is 75 for this data

Perform DAPC on the PCA results

dapc_result <- dapc(pca_data[, -1], grp = population_labels, n.pca = 75, n.da = 4)