Closed joetabet closed 2 years ago
Hi @joetabet,
Maybe post an reproducible example and we can turn that into an improved documentation as we explain it to you here?
Yes of course, this is a very short example ,
`FounderGenomes = quickHaplo(nInd = 1000,nChr = 2,segSites = 500 ) SP = SimParam$new(FounderGenomes) SP$addTraitA(nQtlPerChr = 50, mean = 0, var = 2) SP$setSexes("yes_sys")
SP$setTrackRec(TRUE) Pop = newPop(FounderGenomes) Pop1 = randCross(pop = Pop, nCrosses = 100, nProgeny = 2)` SP$recHist
Thank you for your time
@joetabet I simplified the example to this
> FounderGenomes = quickHaplo(nInd = 4, nChr = 3, segSites = 10)
> SP = SimParam$new(FounderGenomes)
> SP$setSexes("yes_sys")
>
> SP$setTrackRec(TRUE)
> Pop = newPop(FounderGenomes)
> Pop1 = randCross(pop = Pop, nCrosses = 2, nProgeny = 3)
> SP$pedigree
mother father isDH
1 0 0 0
2 0 0 0
3 0 0 0
4 0 0 0
5 4 1 0
6 4 1 0
7 4 1 0
8 2 3 0
9 2 3 0
10 2 3 0
So, we have 4 founders and I make two crosses (4 with 1 and 2 with 3) each producing 3 progeny, so we then have 6 progeny total in Pop
> SP$recHist
[[1]]
[1] 1 2
[[2]]
[1] 3 4
[[3]]
[1] 5 6
[[4]]
[1] 7 8
[[5]]
[,1]
[1,] list,2
[2,] list,2
[3,] list,2
[[6]]
[,1]
[1,] list,2
[2,] list,2
[3,] list,2
[[7]]
[,1]
[1,] list,2
[2,] list,2
[3,] list,2
[[8]]
[,1]
[1,] list,2
[2,] list,2
[3,] list,2
[[9]]
[,1]
[1,] list,2
[2,] list,2
[3,] list,2
[[10]]
[,1]
[1,] list,2
[2,] list,2
[3,] list,2
SP$recHist
tracks all recombinations across all individuals ever created. So, 10 in our example.
> is(SP$recHist)
[1] "list" "vector" "listOrNULL"
SP$recHist
is a list so lets dive into individual 5
> SP$recHist[[5]]
[,1]
[1,] list,2
[2,] list,2
[3,] list,2
There is one list node for every chromosome we simulated (3 in the example). Let's loot into the 1st chromosome.
> SP$recHist[[5]][[1]]
[,1]
[1,] integer,4
[2,] integer,6
OK, this is diploid situation, so this individual has two chromosome copies. Let's look at the the 1st copy:
> SP$recHist[[5]][[1]][[1]]
[,1] [,2]
[1,] 2 1
[2,] 1 10
Now, this is where I am blanking out now. I think that every row is a recombination. Think of going from left-to-right in parental chromosomes and generating progeny chromosome. The first row indicates which parental chromosome should we start with (the 1st column - and since it's a start the value in the 2nd column is 1), then at position 10 there was a recombination to the other parental chromosome.
Similarly for the other copy:
> SP$recHist[[5]][[1]][[2]]
[,1] [,2]
[1,] 2 1
[2,] 1 2
[3,] 2 6
You might think that the start seems to always be 2nd parental chromosome, but looking at other progeny you will see that this is not the case
> SP$recHist[[6]][[1]][[1]]
[,1] [,2]
[1,] 2 1
[2,] 1 7
> SP$recHist[[6]][[1]][[2]]
[,1] [,2]
[1,] 2 1
[2,] 1 6
> SP$recHist[[7]][[1]][[1]]
[,1] [,2]
[1,] 1 1
> SP$recHist[[7]][[1]][[2]]
[,1] [,2]
[1,] 2 1
[2,] 1 5
> SP$recHist[[8]][[1]][[1]]
[,1] [,2]
[1,] 2 1
> SP$recHist[[8]][[1]][[2]]
[,1] [,2]
[1,] 1 1
> SP$recHist[[9]][[1]][[1]]
[,1] [,2]
[1,] 1 1
> SP$recHist[[9]][[1]][[2]]
[,1] [,2]
[1,] 2 1
> SP$recHist[[10]][[1]][[1]]
[,1] [,2]
[1,] 2 1
> SP$recHist[[10]][[1]][[2]]
[,1] [,2]
[1,] 1 1
[2,] 2 5
[3,] 1 7
@gaynorr did I miss anything? Let me know so I can correct and prepare a pull request with a doc update for this bit.
@gregorgorjanc Thank you so much for the clarification. So basically what I understood was ,
SP$recHist[[5]][[1]][[2]] [,1] [,2] [1,] 2 1 [2,] 1 2 [3,] 2 6
[[5]] individual [[1]] first chromosome [[2]] second copy of the first chromosome.
Now here is where things get a bit confusing, basically we are looking at segsites that are in recombination? This means that the first column [,1] is from the sire and [,2] dams (parenteral) and that for the second copy of the chromosome site#2 from father and site #1 from mother were linked, and so one site crossed over and so on? for second-row site #1 from father and site #2 from mother are together? Or am I missing something? I am sorry for all the trouble, but is there a way of calculating the rate? And also, when I add a trait with QTL per chromosome, how can I see study the recombination then? Thank you
@joetabet the above rows show ORIGIN of chromosome segments (paternal or maternal chromosome of the second chromosome copy - of the first chromosome pair - in individual 5) and their START position (hence the end position is start - 1).
We read the above as:
I don't understand what you mean by the rate. Recombination rate? We specify expected recombination rate in runMacs(). The default, I think!, is 1 Morgan chromosome, but check the documentation. Each chromosome will obviously have only few recombinations.
I am also not clear what do you mean by trait, QTL, and recombination. Can you rewrite the question? It looks though that your question is not directly related to the original question. Perhaps you should start a discussion topic on this site - not an issue;)
Thanks for your quick answer Gregor.
SP$recHist[[5]][[1]][[2]] [,1] [,2] [1,] 2 1 [2,] 1 2 [3,] 2 6
So then in this example, [,1] = 1 means the segment was inherited from the mother's paternal chromosome and 2 means that it was taken from the mother's maternal chromosome?
@jamonterotena [, 1]
and [, 2]
denotes columns of a matrix and [1,]
, [2, ]
, and [3,]
denote rows of a matrix - this is how R prints out a matrix when there are no row and column names. To help you understand the behaviour, I invite you to print out haplotypes (pullSegSiteHaplo()
and pullIbdHaplo()
) so you can convince yourself what is going on.
Followed your instructions. This is an example of the haplotype of an individual, 25.
Though not seen here, I checked that the sequence 25_1 descends from 25's mother (grandmother 4, grandfather 14) and 25_2 from its father (grandmother 15, grandfather 13).
Looked at the recHist list for 25. It is obvious that [[1]] accounts for the crossovers between the mother's chromosomes and [[2]] the same for the father. Just by comparing with the IBD state of the alleles in previous picture, this becomes obvious. Then, the rows with 1 in the first column account for segments inherited from the maternal chromosome of the respective parent.
However, then I remember that the function 'countRecomb' in the script 'simulate_chromosome.R' utilized [[1]] as the paternal side and [[2]] as the maternal side:
_recHist <- SP$recHist countRecomb <- function(index) { ind = recHist[[index]] if(!is.null(ind)) { pat = ind[[1]][[1]] values_pat = nrow(pat) - 1 mat = ind[[1]][[2]] values_mat = nrow(mat) - 1 values_total = values_pat + values_mat return(c(index, values_pat, values_mat, valuestotal)) } return(c(index, NA, NA, NA)) } tmp = lapply(201:length(recHist), countRecomb)
This is probably what confused me in the first place. Maybe I am missing something but could you check this out? Sometimes when coding it is inevitable to make this kind of mistakes.
@jamonterotena where is the simulate_chromosome.R
script?
it's in the github repository mentioned in the article: https://github.com/gaynorr/AlphaSimR_Examples/blob/master/recombination_simulations/simulate_chromosome.r
This is how the function is actually written: _recHist <- SP$recHist tmp = lapply(1:length(recHist), function(index) { ind = recHist[[index]] if(!is.null(ind)) { pat = ind[[1]][[1]] values_pat = nrow(pat) - 1 mat = ind[[1]][[2]] values_mat = nrow(mat) - 1 values_total = values_pat + values_mat return(c(index, values_pat, values_mat, valuestotal)) } return(c(index, NA, NA, NA)) })
Hmm, looks like a buglet to me, right @gaynorr ?
Yeah, it looks like that script has the paternal and maternal chromosomes flipped. It's always maternal chromosomes first in AlphaSimR.
Hello, I am using the function SimParam$setTrackRec() and SP$recHist. I am wondering is there some literature about the output of the function, to better understand my results. Thanks !