mlr-org / mlr3benchmark

Analysis and tools for benchmarking in mlr3 and beyond.
https://mlr3benchmark.mlr-org.com/
GNU Lesser General Public License v3.0

refactoring -- ws 2023 discussion with john, seb, bb #37

Open berndbischl opened 1 year ago

berndbischl commented 1 year ago

bmr = benchmark(....)

bma = as_benchmark_aggr(bmr)
bms = as_benchmark_score(bmr)
bml = as_benchmark_loss(bmr)

friedman / blme

autoplot(bma) ---> plots without tests: mean, box

infer_friedman_global(bma, measure) --> s3: (friedman_global, infer)

infer_friedman_posthoc(bma, measure, global = T/F) --> s3: (friedman_posthoc, infer)

autoplot.friedman_posthoc(ifp, type = c("cd", "fn")) ---> does CD plot

comment: we keep the posthoc-matrix plot. only keep one CD plot, MAYBE have some MILD, SIMPLE args to configure its style

infer_blme: similar to the above
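
A minimal sketch of what the `infer_friedman_global` entry above could look like, assuming a plain data.frame with `task` and `learner` columns stands in for the aggregated container. The column names and the fields of the returned list are assumptions for illustration, not the package's API. Only `stats::friedman.test` and the S3 class vector `(friedman_global, infer)` come from base R and the notes above.

```r
# Hypothetical sketch, not the mlr3benchmark implementation.
# `scores`: data.frame with one row per (task, learner) pair plus measure columns.
infer_friedman_global = function(scores, measure) {
  stopifnot(is.data.frame(scores), measure %in% names(scores))

  # friedman.test() expects a blocks x groups matrix: tasks x learners here.
  mat = tapply(scores[[measure]], list(scores$task, scores$learner), mean)
  test = stats::friedman.test(mat)

  structure(
    list(
      measure = measure,
      statistic = unname(test$statistic),
      p_value = test$p.value,
      n_tasks = nrow(mat),
      n_learners = ncol(mat)
    ),
    class = c("friedman_global", "infer")
  )
}

print.friedman_global = function(x, ...) {
  cat(sprintf("Friedman test on '%s': chi^2 = %.3f, p = %.4f\n",
    x$measure, x$statistic, x$p_value))
  invisible(x)
}

# Toy data: 4 tasks x 3 learners, values invented for illustration.
scores = expand.grid(task = paste0("t", 1:4), learner = c("a", "b", "c"),
  stringsAsFactors = FALSE)
scores$ce = c(0.10, 0.20, 0.15, 0.30,
              0.12, 0.25, 0.18, 0.33,
              0.20, 0.30, 0.25, 0.40)

res = infer_friedman_global(scores, "ce")
print(res)
```

Returning a small classed list keeps `print` and `autoplot` methods dispatchable on `friedman_global`, while the shared `infer` class leaves room for generic helpers across test types.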


we probably want some form of standardization if we have multiple tasks
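
One possible reading of "standardization" here, sketched under assumptions: rescale each measure within a task so that tasks of different difficulty become comparable. The helper name `standardize_per_task`, its arguments, and the two methods (z-score, rank) are hypothetical choices for illustration. The notes do not say which form is intended.

```r
# Hypothetical helper, not part of mlr3benchmark.
# `scores`: data.frame with columns task, learner and the given measure column.
standardize_per_task = function(scores, measure, method = c("zscore", "rank")) {
  method = match.arg(method)
  f = switch(method,
    zscore = function(x) {
      s = stats::sd(x)
      # Tasks with no spread (or a single learner) are mapped to 0.
      if (is.na(s) || s == 0) rep(0, length(x)) else (x - mean(x)) / s
    },
    rank = function(x) rank(x, ties.method = "average")
  )
  # ave() applies f within each task and keeps the original row order.
  scores[[paste0(measure, "_std")]] = ave(scores[[measure]], scores$task, FUN = f)
  scores
}

# Toy data: an "easy" and a "hard" task on very different error scales.
scores = data.frame(
  task = rep(c("easy", "hard"), each = 3),
  learner = rep(c("a", "b", "c"), times = 2),
  ce = c(0.01, 0.02, 0.03, 0.30, 0.40, 0.50)
)

z = standardize_per_task(scores, "ce", method = "zscore")
r = standardize_per_task(scores, "ce", method = "rank")
print(z)
```

After either transformation the two tasks contribute on the same scale, which matters if scores are pooled across tasks in a plot or a model.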

berndbischl commented 1 year ago

containers support multiple measures. plots and tests support 1 measure; if none is passed we use the first one in the container
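
The default rule above could be centralized in one small helper that every plot and test calls first. This is a sketch under assumptions: `resolve_measure` and the `measures` field on the container are hypothetical names, and a plain list stands in for the real container class.

```r
# Hypothetical helper, not part of mlr3benchmark.
# `container`: any object with a character vector `measures`.
resolve_measure = function(container, measure = NULL) {
  available = container$measures
  # No measure passed: fall back to the first one in the container.
  if (is.null(measure)) {
    return(available[1L])
  }
  if (length(measure) != 1L) {
    stop("plots and tests support exactly one measure")
  }
  if (!measure %in% available) {
    stop(sprintf("measure '%s' not found in container", measure))
  }
  measure
}

# A plain list standing in for a container with two measures.
bma = list(measures = c("classif.ce", "classif.acc"))

default_measure = resolve_measure(bma)
explicit_measure = resolve_measure(bma, "classif.acc")
too_many = tryCatch(resolve_measure(bma, bma$measures), error = function(e) "error")
```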

berndbischl commented 1 year ago

infer_blme(bms) --> (blme, infer)

autoplot(iblme, type = c("ridgeline", "interval", "halfeye"))

berndbischl commented 1 year ago

maybe add (exposed) helper that draws from the posterior of the BLME