Closed alexpghayes closed 4 years ago
Update for anyone following along: the new plan is to document output columns names in glance.yaml
, tidy.yaml
, etc. These YAML files will then get translated into a tibble column_glossary
that we will export as a data object. We will then use roxygen templates to pull the relevant columns from the column_glossary
object and populate the documentation. I've started work on this in the column_glossary
branch on my fork.
Hadley has suggested using the evalRd
tag instead. Will look into that soon.
I'm now convinced that @evalRd
is the wrong approach. Consider
#' @evalRd return_tidy("component", regression = TRUE)
#'
#' @details The tibble has one row for each term in the regression. The
#' `component` column indicates whether a particular
#' term was used to model either the `"mean"` or `"precision"`. Here the
#' precision is the inverse of the variance, often referred to as `phi`.
#' At least one term will have been used to model the precision `phi`.
tidy.ivreg <- ...
This (on my local branch) produces documentation for the standard regression terms, and also for the "component"
column. The issue is that tidy.kmeans
also uses the component
column, and describes it as: Cluster id as a factor. For a model ‘k' clusters, these will be 'as.factor(1:k)', or 'as.factor(0:k)' if there’s a noise term.
The results of the @evalRd
call don't show up at all if the documentation has any @return
tag, so the options here are:
@evalRd
. This is an unrealistically large task at the moment.component
column in @details
or elsewhere. I'm opposed to this.So I don't think @evalRd
is flexible enough to standardize the @return
documentation.
The next question is if the @return
documentation can be dealt via @template
. The simplest and most work intensive solution is have templates for each potential column to be returned, and add these in via like:
#' @template return_aic
#' @template return_bic
I don't like this because I'd prefer that the column descriptions all live in the same file.
The next option is use @templateVar
to specify the columns to generate documentation for. This would result in some really long lines on occasion to specify all the necessary columns.
Another template based option is tracking which tidying methods return which columns in the column_glossary
tibble, and just pass the tidier class to a template, which can then pull all the relevant information from the column_glossary
. The tidyverse team and Hadley in particular were strongly opposed to this when I talked to them about it on Monday. Additional, I'm convinced that the column_glossary
should move to modeltests
, in which case it shouldn't have broom specific information.
At this point, it doesn't seem like there's any especially great option. That said, the current documentation is a mess and I don't like the idea of leaving it as is either.
How about named arguments in the ... of return_tidy
can define additional columns and their documentation?
#' @evalRd return_tidy("component", regression = TRUE,
#' p.value = "The p-value",
#' AIC = "blablabla")
This would also give the option of overriding one from e.g. the regression group.
This has been addressed, as far as I'm aware!
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
Many of the tidiers return the same type of information about different model types. For example, a bajillion
glance
methods return a tibble with anAIC
column.Ideally, I'd like to document what the
AIC
column represents in a single place, so that all those glance methods benefit from the same documentation and there's no need to copy-paste any changes to the description.Currently, most methods document the return value with something like:
I'd like to replace this with something along the lines of
where somewhere in the boilerplate describes how to use
filter(column_glossary, used_by == "betareg")
to get see the descriptions of the return columns.@dgrtwo thoughts?