r-devel / r-dev-day

Repo to organize tasks for R Dev Days
7 stars 2 forks source link

Bug 17279: parameter “amount” is badly explained for `jitter()` #39

Open shannonpileggi opened 3 months ago

shannonpileggi commented 3 months ago

On Bugzilla: Bug 17279- parameter "amount" is badly explained for jitter()

First submitted comment below, see Bugzilla for more discussion.

In the manual page for jitter(), the description of parameter "amount" makes a forward reference to a parameter "z" that is explained later ("if ‘= 0’ the default is ‘factor * z/50’").

In general, avoid forward references in manual pages; if unavoidable, mention 
where the reference will be explained.

In the "Description" section the parameter "z" is explained, and another parameter "d" is introduced that only used for parameter "amount".

In general if a variable is "local" to some parameter, explain that variable there. ("we set ‘a <- factor * d/5’ where _d_ is the smallest difference between adjacent unique (apart from fuzz) ‘x’ values.")

Therefore I suggest to move descriptions of "z" and "d" into the description of parameter "amount". The positive side-effect will be that the description of jitter() becomes very short (and "crispy").

In general putting the description in front of the parameters also seems like a good idea to me, because why should you read a long description of parameters when you want to find out what a function does (maybe that is not the function you are looking for)?

It may not be an argument, but in UNIX manual pages "OPTIONS" always follow "DESCRIPTION", probably for a good reason.
collinberke commented 3 months ago

Suggested change to src/library/base/man/jitter.Rd:

\arguments{
  \item{x}{numeric vector to which \emph{jitter} should be added.}
  \item{factor}{numeric vector of length one that scales the amount of jitter when \code{amount} is \code{NULL} or \code{0}.}
  \item{amount}{\code{NULL} (\emph{default}) or numeric vector that specifies the amount of jitter. Passing \code{NULL} or \code{0} to \code{amount} selects a method for calculating the \code{amount}. See Details for more information on methods used.}
}
\description{
  Add a small amount of noise to a numeric vector.
}
\value{
  \code{jitter(x, \dots)} returns a numeric of the same length as
  \code{x}, but with an \code{amount} of noise added in order to break
  ties.
}
\details{
  The result, say \code{r}, is \code{r <- x + runif(n, -a, a)}
  where \code{n <- length(x)} and \code{a} is the \code{amount}
  argument (if specified).

  Let \code{z <- max(x) - min(x)} (assuming the usual case). The amount \code{a} to be added is either provided as \emph{positive}
  argument \code{amount} or otherwise computed from \code{z}, as
  follows:

  If \code{amount} is \code{NULL} (\emph{default}), \code{factor * d / 5} is used, where \code{d} is the smallest difference between adjacent unique (apart from fuzz) \code{x} values. 

  If \code{amount == 0}, \code{a} is set using \code{factor * z / 50}.
}
hturner commented 3 months ago

Thanks @collinberke. It would be great if you could prepare a patch using the GitHub mirror of the R sources, as described here: https://contributor.r-project.org/rdevguide/FixBug.html#using-a-git-mirror. This will make it easier to see the changes that you are proposing.

mmaechler commented 3 months ago

.. and I hope you still keep my https://bugs.r-project.org/show_bug.cgi?id=17279#c3 in mind!

collinberke commented 3 months ago

@hturner, I can do that. I was having trouble building the dev environment in our session last week, so I just dumped what we got done here. I'll work to get these changes into Bugzilla following your guidelines.

@mmaechler, certainly trying to keep your input in mind. New contributor to R Core, so I'll try my best. I'm open to any changes, suggestions, or rejections, though. This was tricky to explain.

hturner commented 3 weeks ago

@collinberke sending a reminder about this. If it helps to set aside some time you might join us at the R Dev Day @ LatinR!