epiforecasts / scoringutils

Utilities for Scoring and Assessing Predictions
https://epiforecasts.io/scoringutils/

scoringutils 0.1.7.2 review #121

Closed Bisaloo closed 2 years ago

Bisaloo commented 3 years ago

Preface

This is an informal review conducted by a lab member. To ensure maximal objectivity, the rOpenSci review template is used. This template also guarantees that this package is following the most up-to-date and strictest standards available in the R community.

The template is released under CC-BY-NC-SA and this review is therefore published under the same license.

The review was finished on 2021-07-27 and concerns version 0.1.7.2 of scoringutils (commit de45fb7d78e1a334fe9e24cf3435f648c82a3cb8).

Package Review

Documentation

The package includes all the following forms of documentation:

The scoringutils package provides a collection of metrics and proper scoring rules that make it simple to score forecasts against the true observed values.

Functionality

Estimated hours spent reviewing: 13h


Review Comments / Code Review

https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/bias.R#L86-L90

could be simplified as

continuous_predictions <- !isTRUE(all.equal(as.vector(predictions), as.integer(predictions)))
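A minimal sketch of such an integer-versus-continuous check, assuming predictions is a numeric matrix of samples. all.equal() returns a character description rather than FALSE when its inputs differ, so its result is wrapped in isTRUE() before negating. The helper name is_continuous() is hypothetical and not part of scoringutils.

```r
# Hypothetical helper for illustration only; not part of scoringutils.
is_continuous <- function(predictions) {
  # all.equal() yields TRUE or a character message, never FALSE,
  # so isTRUE() is needed before applying `!`.
  !isTRUE(all.equal(as.vector(predictions), as.integer(predictions)))
}

# Whole numbers stored as doubles count as integer predictions.
integer_like <- matrix(c(1, 2, 3, 4), nrow = 2)

# Fractional values count as continuous predictions.
continuous <- matrix(c(1.5, 2, 3.25, 4), nrow = 2)

is_continuous(integer_like)
is_continuous(continuous)
```

Note that all.equal() compares up to a small numerical tolerance, so values that differ from whole numbers only by floating-point noise are still treated as integers.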

Some other occurrences:

https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/eval_forecasts_continuous_integer.R#L52

https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/pit.R#L157

[ ] https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/bias.R#L95-L99

[ ] https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/bias.R#L106-L110

[ ] https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/pit.R#L169-L173

[ ] https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/pit.R#L193-L197

predictions <- matrix(rnorm(100), ncol = 10, nrow = 10)
true_values <- rnorm(10)

microbenchmark::microbenchmark(
  "vapply" = { vapply(seq_along(true_values), function(i) sum(predictions[i, ] <= true_values[i]), numeric(1)) },
  "vector" = rowSums(predictions <= true_values),
  check = "identical"
)

## Unit: microseconds
##    expr    min      lq     mean  median     uq    max neval
##  vapply 15.492 16.4605 17.74523 16.8215 17.302 47.741   100
##  vector  5.161  5.7475  6.24385  5.9215  6.089 34.897   100
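The vectorised form relies on R recycling true_values down the columns of predictions, so row i is compared with true_values[i]. A small sketch of that assumption, using a non-square matrix so that a mix-up between rows and columns would be visible (the dimensions here are arbitrary):

```r
set.seed(1)

# 5 observations (rows), 6 predictive samples each (columns).
predictions <- matrix(rnorm(30), nrow = 5, ncol = 6)
true_values <- rnorm(5)

# The vectorised comparison is only valid if there is one true value per row.
stopifnot(length(true_values) == nrow(predictions))

# Vectorised: true_values is recycled column by column.
counts_vector <- rowSums(predictions <= true_values)

# Explicit loop over rows, for comparison.
counts_loop <- vapply(
  seq_along(true_values),
  function(i) sum(predictions[i, ] <= true_values[i]),
  numeric(1)
)

identical(counts_vector, counts_loop)
```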

https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/utils.R#L208

https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/utils.R#L220

https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/R/utils_data_handling.R#L422-L430

Instead, you should use:

comparison_mode <- match.arg(comparison_mode)

...

if (comparison_mode == "ratio") { ... }
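A minimal sketch of the match.arg() pattern, assuming the allowed values are listed in the function signature. The function compare_scores() and the "difference" choice are hypothetical; only "ratio" appears in the snippet above.

```r
# Hypothetical function for illustration; not part of scoringutils.
compare_scores <- function(score_a, score_b,
                           comparison_mode = c("ratio", "difference")) {
  # match.arg() picks the first choice by default and
  # throws an error for values outside the listed choices.
  comparison_mode <- match.arg(comparison_mode)

  if (comparison_mode == "ratio") {
    score_a / score_b
  } else {
    score_a - score_b
  }
}

default_result <- compare_scores(2, 4)
difference_result <- compare_scores(2, 4, comparison_mode = "difference")

# An unsupported value fails early; here the error message is captured.
invalid_result <- tryCatch(
  compare_scores(2, 4, comparison_mode = "product"),
  error = function(e) conditionMessage(e)
)
```

With this pattern the valid choices are documented in the signature itself and no hand-written validation is needed.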

https://github.com/epiforecasts/scoringutils/blob/de45fb7d78e1a334fe9e24cf3435f648c82a3cb8/tests/testthat/test-bias.R#L48

Conclusion

This is overall a solid package that could become a widely used tool in forecast sciences. I could not see any bugs in the code and the performance looks very good on the examples I ran. The package interface is clever and can surely prove useful to a large array of users thanks to the two levels of functions (low-level scoring functions vs all-in-one eval_forecasts()).

Two points could slow down / reduce adoption and these should be fixed for this package to reach its full potential and attract as many users as possible:

seabbs commented 3 years ago

This is amazing and very useful.

These points about reducing interface complexity seem spot on:

  • drop the verbose argument. Either the diagnostic messages are unnecessary and should be dropped entirely or they are useful and should be printed. If users really don’t want to see messages/warnings, they can use base functions suppressMessages() / suppressWarnings(). This could also be controlled by a global switch in options() like usethis is doing.
  • plotting side effects could be removed. The primary goal of eval_forecasts() is to return a data.frame with the scores. Users that want the plot could call another function afterwards. This would allow the removal of the pit_plots argument.
  • remove the possibility of having either a single data or forecasts, truth_data and merge_by as inputs.
  • get rid of summarised and act as if summarised = TRUE if by != summarise_by (is this #106, "Check whether we can get rid of the summarised = TRUE argument?"?)
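A rough sketch of how diagnostics could work without a verbose argument, assuming messages are emitted with message(). The function score_forecasts() and the option name mypkg.quiet are hypothetical, loosely modelled on the usethis.quiet option in usethis.

```r
# Hypothetical function and option name, for illustration only.
score_forecasts <- function(scores) {
  # Global switch: messages are shown unless the option is set to TRUE.
  quiet <- isTRUE(getOption("mypkg.quiet", default = FALSE))
  if (!quiet) {
    message("Scoring ", length(scores), " forecasts")
  }
  mean(scores)
}

# Default behaviour: the message is printed.
result_default <- score_forecasts(c(1, 2, 3))

# Per-call silencing with a base function, no extra argument needed.
result_suppressed <- suppressMessages(score_forecasts(c(1, 2, 3)))

# Session-wide silencing through options(), restored afterwards.
old_options <- options(mypkg.quiet = TRUE)
result_quiet <- score_forecasts(c(1, 2, 3))
options(old_options)
```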
nikosbosse commented 2 years ago

Bonjour!

A few questions:

It is recommended to use the fct() notation when talking about functions. This makes it easier to tell functions apart from other objects, and it enables auto-linking to the function documentation in the pkgdown website.

  • does that work in the vignette as well?

There is a minor issue with the equation rendering in the pkgdown website (e.g., https://epiforecasts.io/scoringutils/reference/abs_error.html). The solution is probably to pass both a LaTeX/mathjax and an ASCII version of the equation to \deqn{}.

  • how can I do that?
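For illustration, \deqn{} accepts two arguments in Rd: the first is the LaTeX form, the second an ASCII fallback used where LaTeX is not rendered. A minimal roxygen sketch follows; the function abs_error_example() and its documentation are hypothetical, not the actual scoringutils source, and whether this resolves the pkgdown rendering would need to be checked on the built site.

```r
#' Absolute error (hypothetical example, not the scoringutils source)
#'
#' @details
#' The absolute error is
#' \deqn{|y - \hat{y}|}{abs(true_values - predictions)}
#' where the first argument of \code{\\deqn} is LaTeX and the second is
#' the plain-text fallback.
#'
#' @param true_values numeric vector of observed values
#' @param predictions numeric vector of point predictions
#' @return numeric vector of absolute errors
abs_error_example <- function(true_values, predictions) {
  abs(true_values - predictions)
}

abs_error_example(c(1, 2, 3), c(1.5, 2, 1))
```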

As mentioned in another discussion, there is some inconsistency in the use of data.table and modifying in place vs copying. Beyond the stylistic issue, this is a possible source of bugs so I’d recommend sticking to one or the other.

  • not entirely sure what to do here
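To illustrate why mixing the two styles can cause bugs, here is a small sketch assuming the data.table package is installed. The function names and the toy columns true_value and prediction are made up for the example. A function that uses := without copying modifies the caller's object, while one that calls copy() first leaves it untouched.

```r
library(data.table)

# Modifies its input by reference: the caller's table gains a column.
add_error_in_place <- function(dt) {
  dt[, abs_error := abs(true_value - prediction)]
  dt
}

# Works on an explicit copy: the caller's table is left unchanged.
add_error_copy <- function(dt) {
  dt <- copy(dt)
  dt[, abs_error := abs(true_value - prediction)]
  dt[]
}

forecasts <- data.table(true_value = c(1, 2), prediction = c(1.5, 1))

scored <- add_error_copy(forecasts)
changed_after_copy <- "abs_error" %in% names(forecasts)

invisible(add_error_in_place(forecasts))
changed_after_in_place <- "abs_error" %in% names(forecasts)
```

Choosing one convention for all exported functions (for example, always copying at the entry point) makes this behaviour predictable for users.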

Thank you very much!

seabbs commented 2 years ago

I think you can't have examples for non-exported functions? Or at least you can't without some workaround. I would say no anyway.

Yes

Will the paper be kept updated forever? I would probably be in favour of having the paper content spread across multiple vignettes, as I imagine it is quite long. That way it will be more fluid and easier to update. If the vignette is the same as the readme, I would probably drop the vignette, or move most of the content into the vignette and just keep a small quick start in the readme.

no idea on the equation issue (@bisaloo probably knows).

I would just use data.table or dplyr. I don't think it's a major issue though.

The preview you gave today looked great by the way.

seabbs commented 2 years ago

Is this closable or worth going back over?

nikosbosse commented 2 years ago

I think we can close it. Testing remains an issue, but that has its own issue :)

nikosbosse commented 2 years ago

Thank you again @Bisaloo, this was amazing!

seabbs commented 2 years ago

Might be best to let @Bisaloo close as part of the review process?

Bisaloo commented 2 years ago

Yep, I think the major points have been moved to separate issues. Let's continue the discussion there.