Closed cr1570 closed 4 years ago
Would you check the result of genind2df(Data2.genind, usepop = TRUE, oneColPerAll = TRUE)
to see if that produces a data.frame with 0s and 1s? This is the function that genind2gtypes
uses so the conversion may be happening there.
As an aside, it might be easier to use df2gtypes
rather than new
to create your gtypes object. The output shouldn't be different, but it is a bit more convenient function.
Hello,
I am using STRATAG to perform pairwise analyses on populations of a globally distributed shark species. I found a discrepancy in my results depending on how I created the gtypes object. My first method created the gtypes object new (see code below).
I changed my methodology to use the genind2gtypes function later in my analysis to try to streamline my code (see code below).
When I use the second method (genind2gtypes) on the same VCF data with the same population designations, I find no significant differentiation in two comparisons that are significant when I use the first method (gtypes new). I noticed that my first method retains the original genotypes (AGCT) and the second method function converts SNP data to 0s and 1s. I am wondering if perhaps I am losing data in the file conversions?
Can you please offer any insight into this issue?
Thank you, Cassandra
library("strataG") library("pegas") library("vcfR")
Method 1: Create new Gtypes object
1: import data from vcf (using pegas), split alleles
Data1 <- read.vcf("data.vcf") Data1M <- as.matrix(Data1) Data1M.split <- alleleSplit(Data1M, sep = '/')
2: import strata for schemes, make sure row names are consistent in both data frames
Strata <- read_csv("Strata.csv", col_names = TRUE) Data.schemes <- as.data.frame(Strata[, c("Region", "Location")]) rownames(Data.schemes) <- Strata$ID rownames(Data1M.split) <- Strata$ID
3: create gtypes objects with schemes attached
Data1.gtype <- new("gtypes", gen.data = Data1M.split, ploidy = 2, schemes = Data.schemes)
4: stratify gtypes objects
Data1.gtype.loc <- stratify(Data.gtype, "Location")
Method 2: Convert genind object to gtypes
1: import data from vcf (using vcfR) and convert to genind
Data2.vcfr <- read.vcfR("data.vcf", verbose = TRUE) Data2.genind <- vcfR2genind(Data2.vcfr, sep="[|/]")
2: add populations to genind object
popID <- read.table("popID.txt", header=FALSE) pop(Data2.genind) <- as.vector(as.matrix(popID))
3: convert genind to gtypes object
Data2.gtype.loc <- genind2gtypes(Data2.genind)
run pairwise statistics
Fstats1.loc <- popStructTest(Data1.gtype.loc, nrep = 10000, stats = c("fst", "gst.dbl.prime", "d"), quietly = FALSE, max.cores = NULL) Fstats2.loc <- popStructTest(Data2.gtype.loc, nrep = 10000, stats = c("fst", "gst.dbl.prime", "d"), quietly = FALSE, max.cores = NULL)