magnusdv / forrel

Forensic pedigree analysis and relatedness inference
GNU General Public License v2.0
10 stars 0 forks source link

Simulation output #2

Closed thoree closed 5 years ago

thoree commented 6 years ago

Below is an example where likelihood calculations are compared to simulations. For me the comparison would have been simpler if genotypes from simpleSim and markerSim had been sorted so, i.e not giving both 1/2 and 2/1. Or is there is a simple way to deal this? gsub below seems clumsy:

library(pedtools)
library(pedprobr)
library(forrel)
p = c(0.5, 0.5); lp = length(p); alleler = 1:lp
Rfe = 0.05; Rma = 0.05 
Mfe = mutationMatrix(alleler, "eq", Rfe)
Mma = mutationMatrix(alleler, "eq", Rma)
z = fullSibMating(1)
l = leaves(z)
z = addChildren(z, father = l[1], mother = l[2], sex = 2, nch=1)
m = marker(z, alleles = alleler, afreq = p, mutmat = list(female = Mfe, male = Mma))
chrom(m) = "X"
l = leaves(z)
one = oneMarkerDistribution(z, m, ids = l)
Nsim = 10000
sim1 = markerSim(z, N = Nsim, partialmarker = m, available = l )
# sim1 = simpleSim(z, Nsim, alleler, p,5)
foo = as.data.frame(sim1)[l, -(1:4)]
foo = gsub("2/1", "1/2", foo)
one.sim = table(foo)/Nsim
one[c(1,3,2)] - one.sim
magnusdv commented 6 years ago

Thanks. I agree - there is need for an easier way of extracting (sorted) genotypes. I will look into it.

For future reference here is a slightly cleaned-up version of this particular example:

library(pedtools)
library(pedprobr)
library(forrel)

z = cousinsPed(0, child=T)
child = 5
z = swapSex(z, child)

alleler = 1:2
Rfe = 0.05; Rma = 0.05 
Mfe = mutationMatrix(alleler, "eq", Rfe)
Mma = mutationMatrix(alleler, "eq", Rma)

m = marker(z, alleles = alleler, chrom = "X", 
           mutmat = list(female = Mfe, male = Mma))

one = oneMarkerDistribution(z, partial = m, ids = child, verbose = F)
one = one[c(1,3,2)] # reorder
one
#>       1/1       1/2       2/2 
#> 0.2910062 0.4179875 0.2910062

Nsim = 10000
sim1 = markerSim(z, N = Nsim, partial = m, available = child, verbose = F)

# Need an easier way of extracting (sorted) genotypes. This hack works, though:
foo = as.matrix(as.data.frame(sim1)[child, -(1:4)])
foo[foo == "2/1"] = "1/2"

one.sim = table(foo)/Nsim
one.sim
#> foo
#>    1/1    1/2    2/2 
#> 0.2902 0.4135 0.2963

# Numbers seem OK
one - one.sim
#>         1/1         1/2         2/2 
#>  0.00080625  0.00448750 -0.00529375

Created on 2018-08-12 by the reprex package (v0.2.0).

thoree commented 6 years ago

Fine!

Den søn. 12. aug. 2018 kl. 12:17 skrev Magnus Dehli Vigeland < notifications@github.com>:

Thanks. I agree - there is need for an easier way of extracting (sorted) genotypes.

For future reference here is a slightly cleaned-up version of this particular example:

library(pedtools) library(pedprobr) library(forrel) z = cousinsPed(0, child=T)child = 5z = swapSex(z, child) alleler = 1:2Rfe = 0.05; Rma = 0.05 Mfe = mutationMatrix(alleler, "eq", Rfe)Mma = mutationMatrix(alleler, "eq", Rma) m = marker(z, alleles = alleler, chrom = "X", mutmat = list(female = Mfe, male = Mma)) one = oneMarkerDistribution(z, partial = m, ids = child, verbose = F)one = one[c(1,3,2)] # reorderone#> 1/1 1/2 2/2 #> 0.2910062 0.4179875 0.2910062 Nsim = 10000sim1 = markerSim(z, N = Nsim, partial = m, available = child, verbose = F)

Should be an easier way of extracting (sorted) genotypes. This hack works, though:foo = as.matrix(as.data.frame(sim1)[child, -(1:4)])foo[foo == "2/1"] = "1/2"

one.sim = table(foo)/Nsimone.sim#> foo#> 1/1 1/2 2/2 #> 0.2902 0.4135 0.2963

Numbers seem OKone - one.sim#> 1/1 1/2 2/2 #> 0.00080625 0.00448750 -0.00529375

Created on 2018-08-12 by the reprex package http://reprex.tidyverse.org (v0.2.0).

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/magnusdv/forrel/issues/2#issuecomment-412332538, or mute the thread https://github.com/notifications/unsubscribe-auth/AKVGh3hDxyJKgzR5cmbkjnMBuL3xKI1Cks5uQADSgaJpZM4V5cBK .

magnusdv commented 5 years ago

This can now be done with pedtools::sortGenotypes() which was introduced in this commit: magnusdv/pedtools@24466e3.

At least for now, sorting is not incorporated in markerSim(), so if you need it you should sort after the simulation:

library(forrel, quiet = T)
x = singleton(1)
x = markerSim(x, N = 5, alleles = 1:3, seed = 123, verbose = F)
x
#>  id fid mid sex <1> <2> <3> <4> <5>
#>   1   *   *   1 2/1 3/1 1/2 3/1 3/3

# Note that markers 1, 2 and 4 have misordered genotypes

# Sort
y = sortGenotypes(x)
y
#>  id fid mid sex <1> <2> <3> <4> <5>
#>   1   *   *   1 1/2 1/3 1/2 1/3 3/3