Closed emmanuelparadis closed 8 months ago
This should be solved. The new version (1.3-0.1) has a new method to find all possible MSTs by direct enumeration. This is the new default. The random method is still available with random = TRUE
:
R> library(pegas)
R> data(woodmouse)
R> d <- dist.dna(woodmouse, model = "N")
R> newrnt <- rmst(d)
R> oldrnt <- rmst(d, random = TRUE)
Iteration: 9 Number of links: 22
R> all.equal(newrnt, oldrnt)
## sometimes TRUE but not most of the times
As expected from my previous post, increasing the number of iterations gives the correct RMST:
R> oldrnt100 <- rmst(d, random = TRUE, B = 100)
Iteration: 100
R> all.equal(newrnt, oldrnt100)
[1] TRUE
And there's a (very) substantial gain in performance:
R> system.time(rmst(d))
utilisateur système écoulé
0.001 0.000 0.001
R> system.time(rmst(d, random = TRUE, B = 100, quiet = TRUE))
utilisateur système écoulé
0.052 0.000 0.052
Taking the Hamming distances from the woodmouse data:
I run
rmst()
onD
twice and compare both returned networks.The loop stops after, typically, between 1 and 3 iterations of the
for
loop. If the number of iterations ofrmst()
is fixed to 20:The loop stops after, typically, between 30 and 80 repeats. If
B <- 40
, then this works fine.Conclusion: the default stop criterion in
rmst
is too mild, at least for this dataset (see?rmst)
: