vivekjoshy / openskill.py

Multiplayer Rating System. No Friction.
https://openskill.me
MIT License

Software Paper Review: Suggestions for Clarity and Completeness #115

Closed Naeemkh closed 8 months ago

Naeemkh commented 11 months ago

Please consider the following in drafting the software manuscript:

This issue is related to this submission: https://github.com/openjournals/joss-reviews/issues/5901

vivekjoshy commented 11 months ago

Thank you. I will make these changes as soon as possible.

matt-graham commented 10 months ago

Further to @Naeemkh's comments above, here are some additional comments and suggestions on the updated version of the paper in #116.

Summary

The summary section needs to be made more suitable for a diverse, non-specialist audience, with some context about the problem. What specifically do you mean by 'Online ranking', and what is a rank in this context? I would add a couple of sentences to introduce the problem being considered for readers who may have no prior knowledge of online gaming, for example:

Online gaming communities will typically assign ranks to players based on the outcomes of the games they play, with higher ranking players expected to exhibit higher skill in games. These ranks are used when matching up players and teams for new games, with an aim of ensuring games remain competitive with not too large a disparity in player or team skills.

Ideally avoid short forms like 1v1 which will not necessarily be clear to all readers.

There are several vague, unreferenced or otherwise unsupported claims that should either be better justified or made more specific:

'MMO' needs defining on first usage.

'Similar to TrueSkill' - the @herbrich2006trueskill reference should probably come here, at first mention. Ideally you should also give a very brief (one or two sentences) overview of what TrueSkill is for context.

'OpenSkill offers a pure Python implementation of their models' - from a brief read through, it seems the main contribution of Weng and Lin (2011) is a methodology / algorithm for approximating the Bayesian updates to player skill estimates, given outcome data, for a series of previously proposed probabilistic models for ranked data. I think it would therefore be clearer to say something like 'OpenSkill offers a pure Python implementation of their Bayesian approximation method for probabilistic models of ranked data' or 'approximate Bayesian inference algorithm for estimating the parameters of probabilistic models of ranked data'.

'designed for asymmetric multi-faction multiplayer games' - I would say defining what is meant by asymmetric and multi-faction here would be helpful.

'However OpenSkill boasts several advantages over proprietary models like TrueSkill' - 'TrueSkill model' is not entirely clear here, as the TrueSkill framework combines both a specific probabilistic model and an expectation propagation based approximate Bayesian inference algorithm for updating the model parameters. Likewise, OpenSkill is an implementation of a specific algorithm for estimating the parameters of several different probabilistic models. Ideally you should make clear whether each claim concerns the models used, the algorithm or the specific implementation.

Benchmarks

This section should explicitly state which TrueSkill implementation is being compared to (from the code I believe this is the Python trueskill package available on PyPI). It would also be worth at least mentioning (as already discussed above) that the differences in performance seen here may be partly down to the relative efficiency of the implementations.

'Using a dataset of overwatch matches and player info' - 'overwatch' should be capitalized to make clear it is a proper noun, and either a citation to a URL explaining what it is or a footnote / inline comment giving some context should be included. 'info' should be written in full as 'information', for example:

'Using a dataset of Overwatch [citation to a URL or document explaining what Overwatch is] matches and player information'

'predicts the same number of matches as TrueSkill' - I think something like 'gives a similar predictive performance to TrueSkill' would be better here.

Comparison to related packages

You currently do not directly mention any other commonly used software packages for online player rating estimation, or state how OpenSkill.py compares to them in the main paper (there is a brief mention of openskill.js in the Acknowledgements in the updated paper). Given the acknowledgement in the project README that 'this project is originally based off the openskill.js package', this should also be made clear in the paper, along with some overview of the relative advantages of the implementation here (even if this is just being available in Python as opposed to JavaScript). Similarly, some of the implementations in other languages referenced in the README could be mentioned in the paper, as could the Python trueskill package (https://trueskill.org/) and any other similar packages you are aware of.

References

Two instances of 'bayesian' need to be capitalised by adding braces {} in paper.bib.
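
For illustration, a minimal sketch of the brace protection, assuming one of the affected entries is the Weng and Lin (2011) paper. The entry key and field layout below are hypothetical and may not match what is in paper.bib; the only relevant part is the `{B}` in the title, which stops the bibliography style from lowercasing the letter:

```bibtex
% Hypothetical entry key and layout; only the braces around "B" matter here.
@article{weng2011bayesian,
  author  = {Weng, Ruby C. and Lin, Chih-Jen},
  title   = {A {B}ayesian Approximation Method for Online Ranking},
  journal = {Journal of Machine Learning Research},
  year    = {2011}
}
```

Wrapping the whole word as `{Bayesian}` should work equally well.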