Closed wleoncio closed 1 year ago
Using Exact_cond_midP_unspecific_ordering_rx2(t(fontanella_2008), "decreasing")
as an example, this is the gold standard profile, achieved when calc_prob.ExactCond_unspecific()
is called directly:
$by.self
self.time self.pct total.time total.pct
"test_statistic" 33.92 65.36 41.96 80.85
"calc_prob.ExactCond_unspecific" 3.34 6.44 3.46 6.67
"calc_Pvalue_4x2.ExactCond_unspecific" 3.10 5.97 51.88 99.96
"rbind" 2.62 5.05 3.22 6.20
"rowSums" 1.96 3.78 2.82 5.43
"matrix" 1.64 3.16 1.70 3.28
"c" 0.74 1.43 0.74 1.43
"==" 0.62 1.19 0.62 1.19
"%in%" 0.54 1.04 0.72 1.39
":" 0.52 1.00 0.52 1.00
"is.data.frame" 0.44 0.85 0.44 0.85
"^" 0.42 0.81 0.42 0.81
"/" 0.32 0.62 0.32 0.62
"+" 0.32 0.62 0.32 0.62
"prod" 0.30 0.58 0.30 0.58
"(" 0.26 0.50 0.26 0.50
"-" 0.22 0.42 0.22 0.42
">" 0.22 0.42 0.22 0.42
"*" 0.16 0.31 0.16 0.31
"is.array" 0.06 0.12 0.06 0.12
"is.atomic" 0.06 0.12 0.06 0.12
"<" 0.04 0.08 0.04 0.08
"length" 0.04 0.08 0.04 0.08
"Exact_cond_midP_unspecific_ordering_rx2" 0.02 0.04 51.90 100.00
"dimnames" 0.02 0.04 0.02 0.04
$by.total
total.time total.pct self.time self.pct
"Exact_cond_midP_unspecific_ordering_rx2" 51.90 100.00 0.02 0.04
"calc_Pvalue_4x2.ExactCond_unspecific" 51.88 99.96 3.10 5.97
"test_statistic" 41.96 80.85 33.92 65.36
"calc_prob.ExactCond_unspecific" 3.46 6.67 3.34 6.44
"rbind" 3.22 6.20 2.62 5.05
"rowSums" 2.82 5.43 1.96 3.78
"matrix" 1.70 3.28 1.64 3.16
"c" 0.74 1.43 0.74 1.43
"%in%" 0.72 1.39 0.54 1.04
"==" 0.62 1.19 0.62 1.19
":" 0.52 1.00 0.52 1.00
"is.data.frame" 0.44 0.85 0.44 0.85
"^" 0.42 0.81 0.42 0.81
"/" 0.32 0.62 0.32 0.62
"+" 0.32 0.62 0.32 0.62
"prod" 0.30 0.58 0.30 0.58
"(" 0.26 0.50 0.26 0.50
"-" 0.22 0.42 0.22 0.42
">" 0.22 0.42 0.22 0.42
"*" 0.16 0.31 0.16 0.31
"is.array" 0.06 0.12 0.06 0.12
"is.atomic" 0.06 0.12 0.06 0.12
"<" 0.04 0.08 0.04 0.08
"length" 0.04 0.08 0.04 0.08
"dimnames" 0.02 0.04 0.02 0.04
$sample.interval
[1] 0.02
$sampling.time
[1] 51.9
As a matter of fact, we can measure the overhead of convertFunName2Method()
by adding a breakpoint on calc_Pvalue_4x2.ExactCond_unspecific()
and running the following:
Browse[1]> microbenchmark::microbenchmark(
"method" = calc_prob.ExactCond_unspecific(x[, 1], 4, N_choose_np1, nip_choose_xi1),
"generic" = calc_prob(x[, 1], 4, N_choose_np1, nip_choose_xi1)
)
Unit: microseconds
expr min lq mean median uq max neval cld
method 4.017 4.3655 4.84964 4.5275 4.776 16.109 100 a
generic 76.810 78.1670 81.29589 80.0300 81.728 165.326 100 b
So the dispatching adds a horrifying 20x time penalty to calc_prob()
in this example.
Functions involved in the convertFunName2Method()
dispatching taking most total time:
$by.total
total.time total.pct self.time self.pct
"convertFunName2Method" 45.20 44.38 4.02 3.95
"gsub" 29.14 28.61 9.58 9.41
"is.factor" 19.54 19.19 0.52 0.51
"as.character" 17.68 17.36 17.68 17.36
"findInCallstack" 11.76 11.55 6.08 5.97
"match" 4.94 4.85 4.94 4.85
So these should be the first targets for further optimization.
Looks like the callstack can get really long, which makes dispatching using this function very slow. One idea would be to replace the if-else statements with
switch()
. Perhaps simplifying thecallstack
object (from a list to a vector, for example) also helps.Example: