Open gregorgorjanc opened 1 year ago
I checked the createDrones
and getGv
functions and we don't do that in any of them, so I'm guessing we forgot. This should probably be implemented when we create drones, right?
Correct, IF we wanted to do anything, then the best place would be in createDrones()
. I guess, we can just go and divide by 2 and we will get the haploid value.
BUT, should we really do this? There are points for this change and against it:
PRO
createDrones()
such that gv are divided by 2genParam()
(many variance and quantitative genetics functions use genParam()
under the hood!) since that function works with underlying QTL genotypes - there will be a lot of work there (in R and C++!) - maybe we should do something on the SimParam
side, say in defining traits addTrait*()
?CON
genParam()
(many variance and quantitative genetics functions use genParam()
under the hood!) will be challengingA strategy working with genParam()
would be to talk to Chris and see if we perhaps can pass an argument to it to take policy into account - will ask Chris about his thoughts on this.
We should also think about environmental variances! Is environmental variance in drones the same as in workers and the queen? I think we don't know, also, traits will likely differ substantially between these three caste, sometimes they are only expressed in one caste and and not another!
Should we maybe have some information/parameter in SimParam
and SimParamBee
? The question behind this issue arose in my head while I was reading the SLiM manual (https://messerlab.org/slim, https://github.com/MesserLab/SLiM/releases/download/v4.0.1/SLiM_Manual.pdf) and how it handles sex chromosome and fitness calculation for XX females and XY males (SLiM has a parameter directive (like SimParam
) to manage this).
I think all these questions opened a can of worms!? Lot's of unknowns here.
The key point and question for now is that we should discuss what to do in the intermediate period before we do the change/fix - because we are currently wrongly "hidding" this behaviour. Let's talk in one of the meetings on what to do or shoot ideas here!
Sneaking on to your issue page here, because Gregor mentioned how useful you guys find it and I noticed this thread.
I should mention that due to the way the AlphaSimR handles different levels of ploidy, there won't be any difference in genetic values or variances between a haploid individual or a fully inbred diploid individual with the same alleles (or any higher level of ploidy). Under the hood, AlphaSimR is working with relative dosage for genetic effects and not raw dosage. For example, a homozygous genotype will always have a
or -a
for its additive effect. This means that dividing the genetic value by 2 is not correct for AlphaSimR.
Here is an example:
library(AlphaSimR)
founderPop = quickHaplo(100, 10, 100, inbred=TRUE)
SP = SimParam$
new(founderPop)$
addTraitADE(100, meanDD=0.2, relAA=0.1)
diploid = newPop(founderPop)
haploid = reduceGenome(diploid)
varA(diploid) # 1
varA(haploid) # 1
varD(diploid) # 0
varD(haploid) # 0
varAA(diploid) # ~0.2 (2*0.1 due to being fully inbred)
varAA(haploid) # ~0.2 (2*0.1 due to being fully inbred)
mean(gv(diploid) - gv(haploid)) # 0
@gaynorr thanks for coming here!
The issue is that we are simulating drones as doubled haploid (—> diploid) individuals! We did this so that we could easily mate a queen (diploid) with these drones. We initially tried implementing them as haploid (we still have that code in the inactive background), but some downstream functionality became challenging - see https://github.com/HighlanderLab/SIMplyBee/issues/24. Any advice on this would be most welcome!
Yeah, the genetic value of those diploids will be the same as if you were doing haploids instead so my point was that there is no concern there.
If you really want haploids, you can do it with some combination of reduceGenome
, doubleGenome
, and/or mergeGenome
without having to touch c++ functions. Below is an example of these functions.
The catch with this approach is that AlphaSimR will create a new individual with each of these steps and thus a new pedigree entry. That means the internal AlphaSimR pedigree won't look like an actual bee pedigree. You'll get "individuals" that are really just gametes. Everything else in the package should work as expected. Of course the pedigrees aren't correct for bees with the DH trick either, because the actual pedigrees just track queens and ignore drones.
If it were me, I'd probably focus on processing the pedigree. Whether you want diploid or haploid drones is mainly just an aesthetic choice.
If you want something that is truly faithful to what is happening. You could use reduceGenome
to create drones and create your own function for mating. The function for mating can be based off of mergeGenome
. The main difference is that it'd need to sample gametes from the females before merging. Sampling gametes from the females can be done by calling the c++ function createReducedGenome
. This is the function used by reduceGenome
to perform this task. You'd want to call the c++ function here instead of using reduceGenome
, because it will give you control over how the pedigree is handled.
library(AlphaSimR)
founderPop = quickHaplo(2,1,10,inbred=TRUE)
SP = SimParam$new(founderPop)$setTrackRec(TRUE)
diploid = newPop(founderPop)
haploid = reduceGenome(diploid)
newDiploid = doubleGenome(haploid)
mergedDiploid = mergeGenome(females=haploid,
males=haploid,
cbind(1:nInd(haploid),
1:nInd(haploid)))
SP$pedigree
pullIbdHaplo(diploid)
pullIbdHaplo(haploid)
pullIbdHaplo(newDiploid)
pullIbdHaplo(mergedDiploid)
@gaynorr hah, yeah this code of yours works like a charm. I will have to go back into https://github.com/HighlanderLab/SIMplyBee/issues/24 to understand what was bugging me at that time. Maybe the new pullIbdHaplo()
background implementation now works with variable ploidy, while before it worked only with even-ploidy.
Of course the pedigrees aren't correct for bees with the DH trick either, because the actual pedigrees just track queens and ignore drones.
Hmm, why do you say this? Drones come from the queen only, which is what AlphaSimR does track in the following way
library(AlphaSimR)
founderPop = quickHaplo(2,1,10)
SP = SimParam$new(founderPop)$setTrackPed(TRUE)
queens = newPop(founderPop)
SP$pedigree
DH = makeDH(queens)
SP$pedigree
where the last output is
mother father isDH
1 0 0 0
2 0 0 0
3 1 1 1
4 2 2 1
Aha, you mean that father
should be 0
for the DH, right?
@gaynorr we've been testing this code and how it reflect on the genetic values and variances. We noticed that the variance of haploids and double haploids is twice the variance of the diploids - see the code below. Is this ok?
library(AlphaSimR)
founderPop = quickHaplo(nInd= 1000,1,10)
SP = SimParam$new(founderPop)$setTrackRec(TRUE)
SP$addTraitA(nQtlPerChr = 10, mean = 0, var = 1)
diploid = newPop(founderPop)
var(diploid@gv)
nrow(diploid@gv)
diploid2 <- randCross(diploid, nCrosses = 1000)
var(diploid2@gv)
haploid = reduceGenome(diploid)
var(haploid@gv)
nrow(haploid@gv)
newDiploid = doubleGenome(haploid)
var(newDiploid@gv)
mergedDiploid = mergeGenome(females=haploid,
males=haploid,
cbind(1:nInd(haploid),
1:nInd(haploid)))
var(mergedDiploid@gv)
@janaobsteter, that would be correct because the initial population is roughly in Hardy-Weinberg equilibrium and the later are fully inbred (e.g. Va = 2(1+F)pqa). If you add inbred=TRUE
to your quickHaplo
function, they will approximately match up.
@gregorgorjanc , adding the figures
Wheninbred = FALSE
When inbred = TRUE
@gaynorr above are plots for genetic values of diploids and haploids generated from the diploids using reduceGenome
(left column) alongside sum of allele dosages (right column). @janaobsteter shows this for the case where diploids are not inbred (top 2x2 plot) and when they are fully inbred (bottom 2x2 plot).
These plots are generated using the same code as above in the thread - @janaobsteter maybe you add the code to the post for reproducibility.
In the inbred case (bottom) the variance of genetic values in inbred diploids and haploids is the same (about 1). I find this somewhat confusing but I am struggling to put my confusion to words! Let's see. Diploid inbred var is 2pq\alpha(1+F)
--> 1 in our case, but for haploids we should have pq\alpha
, right, though this value would be 1/2 less going from diploid to haploid AND we don't have inbreeding anymore so another 1/2 reduction leading to 1/4, yet the value we have is actually ~1. Is this all due to the used scaling across ploidy levels in AlphaSimR (Table 1 in https://cran.r-project.org/web/packages/AlphaSimR/vignettes/traits.pdf)? Maybe we should add a haploid column to that table too!? I am just likely daft what is going on and would appreciate if you can explain to us what is going on.
In the non-inbred case (top) the variance of genetic values in non-inbred diploids is about 1/2 and haploids is about 1. While I struggled in the above inbred case, here I am really confused.
@gregorgorjanc and @janaobsteter, the scaling of genotype dosage is causing these patterns. In diploids, a
equals the additive genetic effect for changing one allele from the "0" to the "1" allele. This is a change of half the total dosage. In haploids, the effect of change one allele from the "0" to "1" equals 2a
, because this is a change of the full dosage or twice the change you see in diploids.
I think the system in AlphaSimR is easiest to think about when considering increasing the ploidy level. In AlphaSimR, if you double an individual's chromosomes their genetic values remain unchanged. This is because the relative dosages are unchanged. A population having its chromosomes doubled will see no change in its genetic variances.
An alternative way to handle ploidy would be to use absolute dosage instead of relative dosage. Meaning, you consider a
to represent the effect of a change of one "0" allele to a "1" allele. This is the more common formulation used in polyploid literature. This formulation would see genetic values change when you doubled the chromosomes. The direction and size of the change isn't the same for all individuals. However, at a population level the change will result in an increase in genetic variance by a factor of 4.
Neither formulations is biologically sound, because changing ploidy isn't going to be the same for all traits and species. I'm using the AlphaSimR formulation because I thought it made dealing with dominance degrees more intuitive. I also expect there could be some cases where it is approximately biologically correct. However, I would in general caution against comparing individuals with different ploidy levels.
Silly question, but I am travelling and can’t easily access SIMplyBee code - how do we deal with drone quantitative values - do we take 0.5 of each value doubled-haploid (hence diploid fully inbred) or we forgot to do this?