strataG is a toolkit for summarizing haploid sequence and multilocus genetic data and for analysing population structure. Specific individuals, loci, or strata can be selected using standard R '[' indexing methods. The package contains functions for summarizing haploid and diploid loci (e.g., allelic richness, heterozygosity, haplotypic diversity) and haploid sequences by locus and by strata, as well as functions for computing by-site base frequencies and identifying variable and fixed sites among strata. Standard tests of population structure such as PHIst, Fst, Gst, and Jost's D are available in both overall and pairwise forms. If individuals are stratified according to multiple schemes, the stratification can be changed with the stratify() function, and summaries or tests can be re-run on the new object. The package also includes wrappers for several external programs such as fastsimcoal2, STRUCTURE, and mafft. There are also multiple functions for converting data objects for use with other population genetics packages such as adegenet, pegas, and phangorn.
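As a rough illustration of the indexing and re-stratification described above, the sketch below uses the msats.g example object that ships with the package. The accessor names (getIndNames(), getLociNames(), getStrataNames(), getNumLoci()) and the "fine" scheme name are assumptions based on the 2.4 and later series, so check them against the help pages of the installed version. The code does nothing if strataG is not installed.

```r
# Sketch only: the strataG calls run only when the package is available.
if (requireNamespace("strataG", quietly = TRUE)) {
  library(strataG)
  data(msats.g)  # example microsatellite gtypes object bundled with strataG

  # Switch to another stratification scheme stored in the object
  # (assumes a scheme named "fine" exists in msats.g).
  msats.fine <- stratify(msats.g, "fine")

  # '[' takes individuals, loci, and strata, in that order.
  ids <- getIndNames(msats.fine)[1:10]
  loci <- getLociNames(msats.fine)[1:2]
  sub.ids <- msats.fine[ids, , ]
  sub.loci <- msats.fine[, loci, ]
  sub.strata <- msats.fine[, , getStrataNames(msats.fine)[1]]

  # Summaries can be re-run on any of the subsets.
  print(summary(sub.loci))
}
```

Selecting by name rather than by position keeps the subset stable if the order of individuals or loci changes.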
strataG is not available on CRAN, so installing the stable version with install.packages requires adding an extra repository first:
options(repos = c(
zkamvar = 'https://zkamvar.r-universe.dev',
CRAN = 'https://cloud.r-project.org'))
install.packages('strataG')
NB! Make sure that you have installed the development version of the dependency sprex before installing strataG.
To install the latest version from GitHub including the development version of sprex:
# make sure you have Rtools installed
if (!require('devtools')) install.packages('devtools')
# install sprex development version
devtools::install_github("ericarcher/sprex")
# install strataG latest version
devtools::install_github('ericarcher/strataG', build_vignettes = TRUE)
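After either installation route, a quick check such as the following can confirm that both packages load. This is only a sketch using base R; it does not attach the packages, and it reports FALSE for any that are missing.

```r
# Report whether each package can be loaded, without attaching it.
pkgs <- c("sprex", "strataG")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Show the installed strataG version, if there is one.
if (installed[["strataG"]]) print(utils::packageVersion("strataG"))
```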
Vignettes are available on several topics.
To see the list of all available vignettes:
browseVignettes("strataG")
To open a specific vignette:
vignette("gtypes", "strataG")
There is also a tutorial on running fastsimcoal2 through strataG, available through the function fscTutorial().
The preferred citation is the paper:
Archer, F. I., Adams, P. E. and Schneiders, B. B. (2016), strataG: An R package for manipulating, summarizing and analysing population genetic data. Mol Ecol Resour. doi:10.1111/1755-0998.12559
If desired, the current release version of the package can be cited as:
Archer, F. 2016. strataG: An R package for manipulating, summarizing and analysing population genetic data. R package version 1.0.6. Zenodo. http://doi.org/10.5281/zenodo.60416
Fixed readGenData() not recognizing NAs.
Fixed fs2gtypes() not formatting multi-block DNA sequence data as gtypes properly.
alleleFreqFormat, as.array.gtypes
Changed the gtypes object, making it no longer compatible with previous versions.
Changed arlequinRead() so that it will read and parse all .arp files. Added arp2gtypes() to create a gtypes object from parsed .arp files.
dupGenotypes()
strataGUI()
Added na.rm = TRUE to the calculation of mean locus summaries by strata in summary.gtypes. This avoids NaNs when there is a locus with genotypes missing for all samples.
Converted x to a data.frame in df2gtypes in case it is a data.table or tibble.
Changed the gtypes object by replacing the @loci data.frame slot with a @data data.table slot. The data.table has an id character column, a strata character column, and every column afterwards represents one locus. The @strata slot has been removed.
The loci accessor has been removed.
Added as.array, which returns a 3-dimensional array with dimensions of [id, locus, allele].
Display of gtypes objects no longer shows a by-locus summary. The display was getting too slow for data sets with a large number of loci.
The summary function now includes by-sample results.
Added maf to return minimum allele frequency for each locus.
Added ldNe to calculate Ne.
Added expandHaplotypes to expand the haplotypes in a gtypes object to one sequence per individual.
Added read.arlequin back. Fixed missing function error with write.arlequin.
summarizeSamples
Changed evanno from base graphics to ggplot2.
Changed labelHaplotypes to assign haplotypes if possible alternative site combinations match a present haplotype.
Added strataGUI for creating gtypes objects, QA/QC, and population structure analyses.
Added type argument to structurePlot to select between area and bar charts.
Renamed haplotypeLikelihoods to sequenceLikelihoods.
neiDa now creates haplotypes before calculating metric.
Fixed writePhase, which was creating improper input files for PHASE.
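The na.rm = TRUE entry for summary.gtypes above can be illustrated with base R alone. The sketch below is not the package's internal code; the locus names and heterozygosity values are hypothetical and only show why one all-missing locus would otherwise turn a stratum mean into NaN.

```r
# Hypothetical per-locus heterozygosity values for one stratum.
# The second locus has no genotypes, so its summary is NaN.
locus.het <- c(locA = 0.5, locB = NaN, locC = 0.7)

# Without na.rm, the missing locus propagates to the mean.
mean(locus.het)

# With na.rm = TRUE, NaN is dropped and the mean covers the remaining loci.
mean(locus.het, na.rm = TRUE)
```

The trade-off is that the mean is then taken over fewer loci, so strata with different amounts of missing data are averaged over different sets of loci.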