Open gregorgorjanc opened 1 year ago
This script shows general use of genomic selection (GS) in AlphaSimR.
Here is information on using the fixed effect slot in populations.
Here are some general comments on GS in AlphaSimR.
Several models are supported:
- `RRBLUP` and `RRBLUP2`. The ridge regression models.
- `fastRRBLUP`. This is just an implementation of an older version of AlphaBayes that matches the version released in the original AlphaSim. Despite its name, it is not always faster than the ridge regression implementations above. Its primary benefit is low memory usage, due to a mixed-precision implementation that avoids storing all loci in floating-point format at any one time.
- `RRBLUP_D` and `RRBLUP_D2`. These models use genotypic coding of loci to estimate additive and dominance effects. Back-solving is used to obtain the breeding values of individuals, using either the training population's genotype frequencies or the frequencies of a prediction population. These models also fit a directional dominance term as a fixed effect to try to improve prediction accuracy of the dominance effects.
- `RRBLUP_GCA` and `RRBLUP_GCA2`. These models fit random effects for the mother and father of an individual using the haplotype information. They were intended for modeling GS in plant breeding, where hybrids are used in the training population and the inbred genotypes of the parents are fitted as two separate random effects. They could also be used to model crossbred animals or hybrids in outbred crops, but they assume perfect assignment of haplotypes, which is overly favorable.
- `RRBLUP_SCA` and `RRBLUP_SCA2`. These models effectively combine the features of the previous two model types: they fit genotypic effects and can back-solve for gender-specific breeding values (GCA). Like the GCA models, they are targeted at plant breeding applications where hybrids are produced from inbred lines.

You'll notice that most of the models listed above have a variant with a '2' in its name. This alternative model uses a simple EM approach to estimate variance components, which is very efficient when the number of markers is much smaller than the number of individuals in the training population. The original models were designed to be efficient when the number of markers is larger than the number of individuals in the training population.
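To make the original-versus-'2' distinction concrete, here is a minimal sketch that fits `RRBLUP` and `RRBLUP2` to the same simulated training population and compares their accuracy. It assumes a reasonably recent AlphaSimR release. The population size, marker counts, seed, and heritability are arbitrary illustration values and are not taken from the script linked above.

```r
library(AlphaSimR)

set.seed(1)  # arbitrary seed, only for repeatability

# Toy founder haplotypes (all sizes are illustrative assumptions):
# 100 individuals, 2 chromosomes, 200 segregating sites per chromosome
founderPop <- quickHaplo(nInd = 100, nChr = 2, segSites = 200)

# Simulation parameters: one additive trait and one SNP chip
SP <- SimParam$new(founderPop)
SP$addTraitA(nQtlPerChr = 50)
SP$addSnpChip(nSnpPerChr = 100)

# Training population with phenotypes at heritability 0.5
pop <- newPop(founderPop, simParam = SP)
pop <- setPheno(pop, h2 = 0.5, simParam = SP)

# Fit the original model and its EM-based '2' variant on the same data
fitRR  <- RRBLUP(pop, simParam = SP)
fitRR2 <- RRBLUP2(pop, simParam = SP)

# Store the estimated breeding values in separate copies of the population
popRR  <- setEBV(pop, fitRR,  simParam = SP)
popRR2 <- setEBV(pop, fitRR2, simParam = SP)

# Accuracy: correlation between true genetic values and EBVs
accRR  <- as.numeric(cor(gv(popRR),  ebv(popRR)))
accRR2 <- as.numeric(cor(gv(popRR2), ebv(popRR2)))
print(c(RRBLUP = accRR, RRBLUP2 = accRR2))
```

With 200 chip markers and 100 individuals, this toy case sits in the "more markers than individuals" regime that the original models target. To explore the regime where the '2' variants are expected to pay off, increase `nInd` well beyond the marker count and time both calls with `system.time()`.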
The aim here is to show: