mpadge opened this issue 4 years ago
Thanks @mpadge, this is a fantastic write-up and is exactly the type of feedback that helps us to make the package more accessible. Your thorough overview is extremely valuable.
As is evident when looking back at the vignettes and is highlighted by your comments, this package has gone through quite a few structural rearrangements and the language in the vignettes has suffered. I agree with all the suggestions and I think we can break these up into tasks that we can start working through. I think we're reaching the point where we're confident enough with the architecture that we can really invest in making sure everything is clear in our onboarding process.
@elong0527 - how would you feel about organizing a documentation-themed sprint for December? That might be a good way of toning down the implementation pieces around the holidays while still moving forward.
Thanks @mpadge for the great review. Yeah, @dgkf, I think it is a good idea to focus on documentation in December.
First impression: Fantastic work @elong0527 @kkmann @dgkf - you really have given so much thought to the package API, and done a tremendous job of thinking the whole time about future extensibility. And yes, that makes initial familiarization with the package somewhat more burdensome than it might otherwise be, but the extensibility will be critical to the future success and adoption of the package, so it is very clearly worthwhile.
So as @dgkf already knows, rOpenSci are embarking upon a new endeavour to develop a system to peer-review explicitly statistical software packages. Among many other necessities will be the development of tools to assess the "quality" of software (whatever that might be deemed to be). Neither we nor you want to expend effort repeating prior development efforts of other people, and so ... I'm pretty confident at this very early stage that I can say that we will build at least part of the system directly on `riskmetric`. To that end, I have devoted quite some time to familiarizing myself with the package, and so offer, in the tradition of rOpenSci reviews, the following observations. Please note that I have done nothing but engage in the easy job of armchair criticism here. Accordingly, please do not interpret any of the following as implying any degree of overt criticism of the package as a whole, which really is very impressive. The observations are merely made in the hope that addressing them may serve to improve the package even more.

I'll focus here only on the meta-level of package functionality as presented in the vignettes, and so divide this into the corresponding sections.
Main vignette
The main vignette does a pretty good job of explaining the principles on which the package is built. These principles, and particularly the general `pkg_ref` -> `assess` -> `score` workflow, are utterly necessary to understanding how the package works, so I'd suggest it is critical to at least briefly explain these steps, and their individual and collective importance, in the README, rather than the current state of explaining them only in the vignette. Most users will want and expect to be able to at least get started through just scanning the README, yet this is not currently possible. At the very least, you could have a disclaimer in the README clarifying that it is necessary to read the main vignette in order to know how to use the package.

No criticism here, but these three core steps really are (in my opinion at least) a brilliant way of accommodating flexibility at all crucial stages of specifying both what and how metrics are to be assessed and scored. This truly is a lovely implementation - painless, intuitive, and extensible.
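To make those three steps concrete, something like the following minimal sketch could sit in the README itself. This uses the function names as discussed in this review; the exact signatures and output formats are my assumptions and may differ between package versions.

```r
# Minimal sketch of the pkg_ref -> assess -> score workflow (hedged:
# exact function signatures and output may differ between versions)
library(riskmetric)

ref     <- pkg_ref("riskmetric")   # 1. create a reference to a package
metrics <- assess(ref)             # 2. run the available assessments
scores  <- score(metrics)          # 3. convert the assessments into scores

scores
```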
The "core verbs" section serves no real purpose as is, because it does not explain what they are, and so leaves a reader doing little other than wondering what they are supposed to do with that section. I'm not sure whether you mean to imply that the core verbs are:
"assess"
,"score"
, and"summarise"
, or something else? The link to the Extending vignette does no help to clarify at that juncture.Under
"assess"
, I'd suggest breaking the first bit prior to, "For example ... NEWS files ...", and clarify that available assessments can be seen with theall_assessments
function, and then have a dump of the output of that in the vignette itself. The text of the "Adding and Assessment" bit of the Extending vignette could very well live in the first vignette, where it would greatly help to understand the importance of Assessments at the outset.It might help to explain in this first vignette the two primary classes of objects produced by the package:
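For reference, such a dump might look something like the following. This is only a sketch, assuming that `all_assessments()` returns the list of assessment functions shipped with the package, as the vignette suggests.

```r
# Sketch of the kind of "dump" suggested above (assumes all_assessments()
# returns the list of assessment functions bundled with riskmetric)
library(riskmetric)

assessments <- all_assessments()
length(assessments)   # how many assessments are currently available
assessments           # print them so readers can see what gets measured
```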
It might help to explain in this first vignette the two primary classes of objects produced by the package, `pkg_ref` and `pkg_metric` objects, and explain how these are related to the `pkg_ref` -> `assess` -> `score` workflow. It would also seem to me to be necessary to explicitly describe the relationship between the output of `assess` and a `pkg_metric` object, because this is currently not explained anywhere, yet understanding that relationship would be of great help in understanding the overall package.
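As a hedged illustration of what such an explanation might cover, that relationship can be poked at directly. The structure of the `assess()` output shown here is an assumption and may differ between versions.

```r
# Hedged sketch: inspect what assess() actually returns, to see which
# entries are (or contain) pkg_metric objects
library(riskmetric)

ref     <- pkg_ref("riskmetric")
metrics <- assess(ref)

# peek under the hood at the classes of the returned components
str(metrics, max.level = 2)
```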
In fact, it would probably help a lot to initially focus on a `pkg_ref` -> `pkg_metric` workflow (which works perfectly, even though one would not suspect that from `?pkg_metric`, which is currently not particularly helpful). Once I understand that, then understanding `pkg_ref` -> `assess` becomes much easier, because `assess` just yields a meta-collection of `pkg_metric` objects. (The `pkg_metric()` function is in fact really useful for prying under the hood here, so I really would suggest bringing it more to the (documentation) fore. The current description ... is not really accurate at all: it actually generates a `pkg_metric` object as part of the `assess` function.)

Extending riskmetric
I think it would be useful at the outset of this vignette to explain the four classes of `pkg_ref` object (`"pkg_remote"`, `"pkg_install"`, `"pkg_source"`, `"pkg_missing"`), or maybe just the first three, I guess? Then just explain that the Assessment methods are dispatched for each class of `pkg_ref` object. As is, you say that ..., but I've no idea at that point what "the pkg_install functions" means, so - in the context of the text as currently written - you've lost me from that point on.
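A short sketch of that class-based dispatch might help at this point in the vignette. Which of the four classes you actually get depends on how the package is located, so the classes noted in the comments below are only my assumption of typical output.

```r
# Sketch of pkg_ref class dispatch (results are illustrative and depend on
# whether the package is installed, a local source tree, remote, or missing)
library(riskmetric)

class(pkg_ref("riskmetric"))              # e.g. "pkg_install" for an installed package
class(pkg_ref("not_a_real_package_xyz"))  # presumably falls back to "pkg_missing"
```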
You say that from the `assess` function ..., but this function actually just returns a tibble with no special class attributes, right? Some but not all of the items of that tibble are `pkg_metric` objects (... and see the above comments about `pkg_ref` and `pkg_metric` objects ...).

The need to jump straight into caching is a bit confusing. It would seem necessary to explicitly explain that the entire package only works via a caching system. You currently have ..., but I am also free to read it at that point and think, "No, I don't want to cache it; I just want to try to write a first assessment function without this pesky caching, and can always add that later, right?" - Nope, not right, because caching is essential and integral to the package's functioning. This should be explained simply and mechanistically at this point, along the lines of "all assessment functions work by caching ...", describing simply but precisely how calls to `pkg_ref` rely on `pkg_ref_cache`. (The section itself is called "Example Metadata Caching", which is further unhelpful in that context, connoting that it really is not necessary, when it is actually absolutely essential here.)
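For what it's worth, here is a heavily hedged sketch of how I currently understand that caching to work. It assumes that `pkg_ref` objects are environment-backed and that field access dispatches to a matching `pkg_ref_cache` method; the `description` field is purely illustrative and may be named differently.

```r
# Heavily hedged sketch of the caching mechanism as I understand it
library(riskmetric)

ref <- pkg_ref("riskmetric")

# assumption: the pkg_ref object is an environment, so ls() shows which
# fields have been cached so far (few or none immediately after creation)
ls(ref)

# accessing a field for the first time should dispatch to the matching
# pkg_ref_cache method and store the result ("description" is illustrative)
invisible(ref$description)

# the cached field should now appear, and later accesses re-use it rather
# than recomputing
ls(ref)
```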
General comments
I'll just offer one initially, regarding function names, the main ones of which are arguably somewhat overly generic: `assess`, `score`, and `coverage` will almost certainly meet namespace conflicts at some stage (and yes, `coverage` directly calls the `covr` function of the same name, but loading this package will still issue a namespace conflict warning, which is an undesirable side effect). As recommended in the rOpenSci guide, it would perhaps be better to simply prefix these with `rm_` (so `rm_assess`, `rm_score`, and `rm_coverage`). Note, however, that `rm_` is of course used as a common prefix for remove-type functions, although even then mostly only in the `qdapRegex` package, so it is still largely free otherwise. Maybe `prm_` for package risk metric?

The package seems still "young" enough that this could be done with little consequence. Just a thought ...
Happy to open any of these as individual issues if that would help focus work on addressing them - just let me know. I'm really looking forward to using, breaking, and extending this fabulous package. Already much indebted to you!
mark
ping @noamross @karthik