Consider RegressionTests.jl and Chairmarks.jl for benchmarking

dmbates commented 4 months ago

Initially this branch just provides a bench/runbenchmarks.jl that uses Chairmarks.jl.
The method is to construct a table of dataset names, formulas, and the number of seconds to spend benchmarking each combination. This design is preliminary.
Eventually we may consider using RegressionTests.jl in CI, but that package seems best suited to micro-benchmarks.
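
For readers who have not seen the script, the approach described above might look roughly like the following. This is a minimal sketch, not the contents of bench/runbenchmarks.jl: the helper name `fitbmk`, the two-row table, and the use of a plain vector of named tuples are assumptions made for illustration. It relies on the Chairmarks `@b` macro with its `seconds` keyword and on `MixedModels.dataset` and `fit`.

```julia
using Chairmarks   # provides the @b macro
using MixedModels  # provides dataset, @formula, fit, MixedModel

# Hypothetical two-row subset of the benchmark table (dataset, time budget, formula).
tbl = [
    (dsnm = :dyestuff, secs = 0.1, frm = @formula(yield ~ 1 + (1 | batch))),
    (dsnm = :sleepstudy, secs = 0.1, frm = @formula(reaction ~ 1 + days + (1 | subj))),
]

# Hypothetical helper: benchmark one model fit, sampling for at most `secs` seconds.
function fitbmk(dsnm, secs, frm)
    dat = MixedModels.dataset(dsnm)
    return @b fit(MixedModel, frm, dat; progress = false) seconds=secs
end

# One result row per input row, keeping the dataset name and formula alongside the sample.
results = [(; r.dsnm, bmk = fitbmk(r.dsnm, r.secs, r.frm), r.frm) for r in tbl]
```

Giving each row its own time budget keeps the small models from dominating the run while still allowing at least one evaluation of the slow fits.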

dmbates commented 4 months ago

On an M1 MacBook Pro the results were:

Table with 3 columns and 19 rows:
      dsnm        secs  frm
    ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 1  │ dyestuff2   0.1   yield ~ 1 + :(1 | batch)
 2  │ dyestuff    0.1   yield ~ 1 + :(1 | batch)
 3  │ machines    0.1   score ~ 1 + :(1 | Worker) + :(1 | Machine)
 4  │ pastes      0.1   strength ~ 1 + :(1 | batch & cask)
 5  │ pastes      0.1   strength ~ 1 + :(1 | batch / cask)
 6  │ penicillin  0.1   diameter ~ 1 + :(1 | plate) + :(1 | sample)
 7  │ sleepstudy  0.1   reaction ~ 1 + days + :(1 | subj)
 8  │ sleepstudy  0.1   reaction ~ 1 + days + :(zerocorr((1 + days) | subj))
 9  │ sleepstudy  0.1   reaction ~ 1 + days + :(1 | subj) + :((0 + days) | subj)
 10 │ sleepstudy  0.1   reaction ~ 1 + days + :((1 + days) | subj)
 11 │ kb07        0.1   :(log(rt_trunc)) ~ 1 + spkr + prec + load + :(1 | subj) + :(1 | item)
 12 │ kb07        0.1   :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + :(1 | …
 13 │ mrk17_exp1  1.0   :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q &…
 14 │ insteval    5.0   y ~ 1 + service + dept + service & dept + :(1 | s) + :(1 | d)
 15 │ insteval    5.0   y ~ 1 + service + :(1 | s) + :(1 | d) + :(1 | dept)
 16 │ kb07        5.0   :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + :((1 +…
 17 │ mrk17_exp1  25.0  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q &…
 18 │ d3          25.0  y ~ 1 + u + :((1 + u) | g) + :((1 + u) | h) + :((1 + u) | i)
 19 │ ml1m        25.0  y ~ 1 + :(1 | g) + :(1 | h)

julia> res = runbmrk(tbl)
Table with 3 columns and 19 rows:
      dsnm        bmk                                                          frm
    ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 1  │ dyestuff2   Sample(time=8.2583e-5, allocs=1039, bytes=53040)             yield ~ 1 + :(1 | batch)
 2  │ dyestuff    Sample(time=8.3209e-5, allocs=1044, bytes=53136)             yield ~ 1 + :(1 | batch)
 3  │ machines    Sample(time=0.000216375, allocs=2029, bytes=102688)          score ~ 1 + :(1 | Worker) + :(1 | Machine)
 4  │ pastes      Sample(time=0.000117083, allocs=1477, bytes=91336)           strength ~ 1 + :(1 | batch & cask)
 5  │ pastes      Sample(time=0.000245833, allocs=2391, bytes=130592)          strength ~ 1 + :(1 | batch / cask)
 6  │ penicillin  Sample(time=0.000349334, allocs=2875, bytes=155280)          diameter ~ 1 + :(1 | plate) + :(1 | sample)
 7  │ sleepstudy  Sample(time=0.000109375, allocs=1187, bytes=95368)           reaction ~ 1 + days + :(1 | subj)
 8  │ sleepstudy  Sample(time=0.000215125, allocs=1753, bytes=128544)          reaction ~ 1 + days + :(zerocorr((1 + days) | subj))
 9  │ sleepstudy  Sample(time=0.000246875, allocs=2044, bytes=170896)          reaction ~ 1 + days + :(1 | subj) + :((0 + days) | subj)
 10 │ sleepstudy  Sample(time=0.000618667, allocs=2490, bytes=142272)          reaction ~ 1 + days + :((1 + days) | subj)
 11 │ kb07        Sample(time=0.00159417, allocs=12691, bytes=1293768)         :(log(rt_trunc)) ~ 1 + spkr + prec + load + :(1 | subj) + …
 12 │ kb07        Sample(time=0.00517542, allocs=15926, bytes=2628872)         :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + …
 13 │ mrk17_exp1  Sample(time=0.0700094, allocs=161656, bytes=118467680, gc_…  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P…
 14 │ insteval    Sample(time=0.297278, allocs=299482, bytes=141469552, gc_f…  y ~ 1 + service + dept + service & dept + :(1 | s) + :(1 |…
 15 │ insteval    Sample(time=0.579619, allocs=303283, bytes=53943424)         y ~ 1 + service + :(1 | s) + :(1 | d) + :(1 | dept)
 16 │ kb07        Sample(time=0.146926, allocs=57548, bytes=5561244)           :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + …
 17 │ mrk17_exp1  Sample(time=4.40508, allocs=263701, bytes=153506900, gc_fr…  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P…
 18 │ d3          Sample(time=4.36566, allocs=732859, bytes=165737656, gc_fr…  y ~ 1 + u + :((1 + u) | g) + :((1 + u) | h) + :((1 + u) | …
 19 │ ml1m        Sample(time=11.3488, allocs=2004602, bytes=430842888, gc_f…  y ~ 1 + :(1 | g) + :(1 | h)

If you print the second column with MIME("text/plain") you get the compact version.

julia> res.bmk
19-element Vector{Chairmarks.Sample}:
 82.583 μs (1039 allocs: 51.797 KiB)
 83.209 μs (1044 allocs: 51.891 KiB)
 216.375 μs (2029 allocs: 100.281 KiB)
 117.083 μs (1477 allocs: 89.195 KiB)
 245.833 μs (2391 allocs: 127.531 KiB)
 349.334 μs (2875 allocs: 151.641 KiB)
 109.375 μs (1187 allocs: 93.133 KiB)
 215.125 μs (1753 allocs: 125.531 KiB)
 246.875 μs (2044 allocs: 166.891 KiB)
 618.667 μs (2490 allocs: 138.938 KiB)
 1.594 ms (12691 allocs: 1.234 MiB)
 5.175 ms (15926 allocs: 2.507 MiB)
 70.009 ms (161656 allocs: 112.980 MiB, 1.94% gc time)
 297.278 ms (299482 allocs: 134.916 MiB, 0.54% gc time)
 579.619 ms (303283 allocs: 51.444 MiB)
 146.926 ms (57548 allocs: 5.304 MiB)
 4.405 s (263701 allocs: 146.396 MiB, 0.08% gc time)
 4.366 s (732859 allocs: 158.060 MiB, 0.04% gc time)
 11.349 s (2004602 allocs: 410.884 MiB, 0.11% gc time)

I'm not sure how to get that version in the display of the table.
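
One possible workaround, sketched here with a stand-in type rather than `Chairmarks.Sample`, is to render each element through its `MIME("text/plain")` method before building the table, so the column holds the compact strings. `ToySample` and its two `show` methods are hypothetical and exist only to make the example self-contained; `repr(MIME("text/plain"), x)` is the Base call doing the work.

```julia
# Hypothetical stand-in for a benchmark sample; not the Chairmarks type.
struct ToySample
    time::Float64   # seconds
    allocs::Int
end

# Two-argument show: the verbose form that containers and tables tend to use.
Base.show(io::IO, s::ToySample) =
    print(io, "ToySample(time=", s.time, ", allocs=", s.allocs, ")")

# Three-argument show: the compact form used for MIME("text/plain").
Base.show(io::IO, ::MIME"text/plain", s::ToySample) =
    print(io, round(s.time * 1e6; digits = 3), " μs (", s.allocs, " allocs)")

bmk = [ToySample(8.2583e-5, 1039), ToySample(8.3209e-5, 1044)]

# Convert every element to its text/plain representation up front.
compact = [repr(MIME("text/plain"), s) for s in bmk]
```

The trade-off is that the column then contains strings, so the numeric fields are no longer available from the table and the strings may be displayed with quotation marks.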

codecov[bot] commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 96.93%. Comparing base (34899cf) to head (c90f3ab).

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #753   +/-   ##
=======================================
  Coverage   96.93%   96.93%
=======================================
  Files          34       34
  Lines        3358     3358
=======================================
  Hits         3255     3255
  Misses        103      103
```

| [Flag](https://app.codecov.io/gh/JuliaStats/MixedModels.jl/pull/753/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats) | Coverage Δ | |
|---|---|---|
| [current](https://app.codecov.io/gh/JuliaStats/MixedModels.jl/pull/753/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats) | `96.87% <ø> (ø)` | |
| [minimum](https://app.codecov.io/gh/JuliaStats/MixedModels.jl/pull/753/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats) | `96.83% <ø> (ø)` | |
| [nightly](https://app.codecov.io/gh/JuliaStats/MixedModels.jl/pull/753/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats) | `96.43% <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#carryforward-flags-in-the-pull-request-comment) to find out more.

dmbates commented 4 months ago

Another data point:

julia> res = runbmrk(tbl)
Table with 3 columns and 19 rows:
      bmk                                                                             dsnm        frm
    ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 1  │ Sample(time=9.4062e-5, allocs=1029, bytes=52176)                                dyestuff2   yield ~ 1 + :(1 | batch)
 2  │ Sample(time=9.5404e-5, allocs=1034, bytes=52272)                                dyestuff    yield ~ 1 + :(1 | batch)
 3  │ Sample(time=0.000273862, allocs=2165, bytes=104800)                             machines    score ~ 1 + :(1 | Worker) + :(1 | Machine)
 4  │ Sample(time=0.000146979, allocs=1466, bytes=90456)                              pastes      strength ~ 1 + :(1 | batch & cask)
 5  │ Sample(time=0.000284636, allocs=2371, bytes=128864)                             pastes      strength ~ 1 + :(1 | batch / cask)
 6  │ Sample(time=0.000392762, allocs=2855, bytes=153424)                             penicillin  diameter ~ 1 + :(1 | plate) + :(1 | sample)
 7  │ Sample(time=0.000123226, allocs=1151, bytes=93560)                              sleepstudy  reaction ~ 1 + days + :(1 | subj)
 8  │ Sample(time=0.000373249, allocs=1743, bytes=127104)                             sleepstudy  reaction ~ 1 + days + :(zerocorr((1 + days) | subj))
 9  │ Sample(time=0.000417766, allocs=2033, bytes=169472)                             sleepstudy  reaction ~ 1 + days + :(1 | subj) + :((0 + days) | subj)
 10 │ Sample(time=0.00109645, allocs=2480, bytes=140832)                              sleepstudy  reaction ~ 1 + days + :((1 + days) | subj)
 11 │ Sample(time=0.00163458, allocs=12461, bytes=1281736)                            kb07        :(log(rt_trunc)) ~ 1 + spkr + prec + load + :(1 | subj) + :(1 | item)
 12 │ Sample(time=0.00758549, allocs=16064, bytes=2632632)                            kb07        :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + :(1 | subj) + :((1 + prec) | item)
 13 │ Sample(time=0.0767897, allocs=159445, bytes=118266464, gc_fraction=0.00927004)  mrk17_exp1  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q & lT + lQ & lT + F & P & Q + F & P & lQ + F & Q & lQ + P & Q &…
 14 │ Sample(time=0.389135, allocs=299479, bytes=141466336, gc_fraction=0.00219119)   insteval    y ~ 1 + service + dept + service & dept + :(1 | s) + :(1 | d)
 15 │ Sample(time=0.797134, allocs=303602, bytes=53942976)                            insteval    y ~ 1 + service + :(1 | s) + :(1 | d) + :(1 | dept)
 16 │ Sample(time=0.188412, allocs=52410, bytes=5451148)                              kb07        :(log(rt_trunc)) ~ 1 + spkr + prec + load + spkr & prec + spkr & load + prec & load + spkr & prec & load + :((1 + spkr + prec + load) | subj) + :((1 + spkr + prec + load) | i…
 17 │ Sample(time=5.92026, allocs=281198, bytes=153694324, gc_fraction=0.000424185)   mrk17_exp1  :(1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q & lT + lQ & lT + F & P & Q + F & P & lQ + F & Q & lQ + P & Q &…
 18 │ Sample(time=13.6477, allocs=733319, bytes=165914536, gc_fraction=0.000517121)   d3          y ~ 1 + u + :((1 + u) | g) + :((1 + u) | h) + :((1 + u) | i)
 19 │ Sample(time=19.5966, allocs=2004582, bytes=430835656, gc_fraction=0.00134264)   ml1m        y ~ 1 + :(1 | g) + :(1 | h)

julia> versioninfo()
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
Threads: 8 default, 0 interactive, 4 GC (on 8 virtual cores)
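
To compare the two runs, the ratio of fit times can be computed directly from the `time` fields in the two tables. The sketch below copies three rows by hand; the variable names are illustrative, and the comparison is only rough because the Julia version used for the M1 run is not stated.

```julia
# Times in seconds copied from the `time` fields of the two result tables.
m1    = (dyestuff2 = 8.2583e-5, d3 = 4.36566, ml1m = 11.3488)   # M1 MacBook Pro run
intel = (dyestuff2 = 9.4062e-5, d3 = 13.6477, ml1m = 19.5966)   # i7-1165G7 Linux run

# Element-wise ratio; values above 1 mean the Intel run was slower.
ratio = map(/, intel, m1)

for (k, r) in pairs(ratio)
    println(rpad(string(k), 10), round(r; digits = 2))
end
```

For these rows the gap is small for dyestuff2 and largest for d3, where the Intel run takes roughly three times as long.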

dmbates commented 3 weeks ago

It turns out that using RegressionTests.jl and @track to check for changes in benchmark runs takes a very long time, and I don't think it is worth the cost. Detecting a small decrease in the time to fit a simple model is not terribly interesting, and the methodology of RegressionTests.jl is not well suited to comparing fitting speed on complex models.

JuliaStats / MixedModels.jl

Consider RegressionTests.jl and Chairmarks.jl for benchmarking #753
