crsh / papaja

papaja (Preparing APA Journal Articles) is an R package that provides document formats to produce complete APA manuscripts from R Markdown files (PDF and Word documents) and helper functions that facilitate reporting statistics, tables, and plots.
https://frederikaust.com/papaja_man/

Allow customization of reporting style in apa_print() #506

Open crsh opened 2 years ago

crsh commented 2 years ago

This PR by @JeffreyRStevens has prompted me to think again about adding a layer of abstraction to apa_print() that allows customization of the reporting style. Jeffrey's use case is to allow leading zeros for p values, but there are many other things that users might want to adjust (e.g., the number of digits, order of reported statistics, etc.).

I think if we decide to go down this path, it would make sense to split the package further:

  1. A general purpose version of apa_print(), e.g. typeset() or report(), that accesses a set of global preferences that define the reporting style.
  2. A package specifically geared towards reporting in APA style (including apa_print()).
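
A minimal sketch of the first idea, using only base R. The option name `report.style` and the helpers `get_style()` and `report_p()` are hypothetical and do not exist in papaja; the sketch only illustrates how a general-purpose function could read its formatting rules from global preferences set with `options()`.

```r
# Hypothetical defaults; the option name "report.style" is an assumption.
default_style <- list(p_leading_zero = FALSE, p_digits = 3)

# Merge user preferences (if any) into the defaults.
get_style <- function() {
  user <- getOption("report.style", default = list())
  modifyList(default_style, user)
}

# Format a p value according to the currently active style.
report_p <- function(p) {
  style <- get_style()
  cutoff <- 10^(-style$p_digits)
  if (p < cutoff) {
    out <- paste0("< ", formatC(cutoff, digits = style$p_digits, format = "f"))
  } else {
    out <- paste0("= ", formatC(p, digits = style$p_digits, format = "f"))
  }
  # Drop the leading zero unless the style asks for it.
  if (!style$p_leading_zero) out <- sub("0.", ".", out, fixed = TRUE)
  paste0("p ", out)
}

report_p(0.0421)                                     # default style: no leading zero
options(report.style = list(p_leading_zero = TRUE))  # switch the style globally
report_p(0.0421)                                     # same call, leading zero kept
```

Under this design an APA-specific package would only need to ship a set of defaults, while users with other requirements override individual entries.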

(I remember discussing something like the general purpose package with other developers at a repository that, I think, has since been moved to GitLab, but I can't seem to find it now.)

So, I think it's worth briefly thinking about how much flexibility we need and then seeing whether we can find a nice way to implement it.

  1. Formatting of specific statistics (leading zeros, number of digits, etc.)
  2. Typesetting of statistics (I'm thinking of degrees of freedom for t-tests, e.g. $t_{15}$ rather than $t(15)$)
  3. Order of statistics?

To be honest, I'm not very familiar with a lot of other style guides, so any input here would be greatly appreciated.

So what we would need is a set of arguments passed to print_num() for each statistic and the option to customize the lookup tables and glues we use to put together the column names and reporting strings?
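
One way this could look, as a rough sketch in base R: a style object holds formatting arguments per statistic plus a template that controls typesetting and order. All names here (`fmt_num()`, `apa_like`, `subscript_df`, `typeset_t()`) are hypothetical. `fmt_num()` is only a stand-in for a number formatter, not papaja's print_num(), and plain string substitution stands in for glue, so the brace handling differs from real glue syntax.

```r
# Stand-in for a number formatter; NOT papaja's print_num().
fmt_num <- function(x, digits = 2, leading_zero = TRUE) {
  out <- formatC(x, digits = digits, format = "f")
  if (!leading_zero) out <- sub("^(-?)0\\.", "\\1.", out)
  out
}

# A style: formatting arguments per statistic and one template per test.
apa_like <- list(
  args = list(
    statistic = list(digits = 2),
    df        = list(digits = 0),
    p         = list(digits = 3, leading_zero = FALSE)
  ),
  template = "$t({df}) = {statistic}$, $p = {p}$"
)

# Same formatting arguments, but degrees of freedom typeset as a subscript.
# The outer braces are literal LaTeX; only the inner {df} is a placeholder.
subscript_df <- modifyList(apa_like, list(
  template = "$t_{{df}} = {statistic}$, $p = {p}$"
))

# Format each statistic with its own arguments and fill in the template.
typeset_t <- function(values, style) {
  out <- style$template
  for (name in names(style$args)) {
    formatted <- do.call(fmt_num, c(list(values[[name]]), style$args[[name]]))
    out <- gsub(paste0("{", name, "}"), formatted, out, fixed = TRUE)
  }
  out
}

values <- list(statistic = 2.4137, df = 15, p = 0.0291)
typeset_t(values, apa_like)      # "$t(15) = 2.41$, $p = .029$"
typeset_t(values, subscript_df)  # "$t_{15} = 2.41$, $p = .029$"
```

Because the template determines which placeholders appear and in what order, reordering or dropping a statistic would amount to editing the template. The sketch ignores cases such as "p < .001".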

This seems doable. What else am I missing? @mariusbarth

JeffreyRStevens commented 2 years ago

Many thanks for considering this. I know it will be a lot of work.

I think statistical value formatting, statistical label typesetting, and order cover most of the flexibility. Would order include dropping statistics you may not want (e.g., confidence intervals)?

As for other style guides, I'm not sure many other publishers are as picky as APA--at least at the submission stage. One thing that I've seen is separating out degrees of freedom from the test statistic (e.g., t = 3.7, df = 6). Another is using ± for confidence intervals instead of the range. Of course, papaja doesn't need to cater to all possibilities, but these are some other styles.

Maybe this should be a separate issue, but I would like some additional flexibility in creating cutoffs, especially for Bayes factors > 1 (e.g., BF_{10} > 1,000). I would also like different numbers of digits for BFs above and below 1 (e.g., 10.2 vs. 0.098). The unique features of Bayes factors may require their own printbf() function. Let me know if you'd like me to submit this as a separate issue. I have some simple functions working on this.
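
To make the request concrete, here is a small base R sketch of such a formatter. `format_bf()` and its arguments are hypothetical; it is neither part of papaja nor the functions mentioned above, and the default cutoff and digits are simply taken from the examples in this comment.

```r
# Hypothetical helper illustrating an upper cutoff and magnitude-dependent digits.
format_bf <- function(bf, cutoff = 1000, digits_large = 1, digits_small = 3) {
  if (bf > cutoff) {
    # Report only that the Bayes factor exceeds the cutoff.
    return(paste0("BF_{10} > ", formatC(cutoff, format = "d", big.mark = ",")))
  }
  # Use fewer decimals for BFs of at least 1, more for BFs below 1.
  digits <- if (bf >= 1) digits_large else digits_small
  paste0("BF_{10} = ", formatC(bf, digits = digits, format = "f"))
}

format_bf(10.234)  # "BF_{10} = 10.2"
format_bf(0.0981)  # "BF_{10} = 0.098"
format_bf(25310)   # "BF_{10} > 1,000"
```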

jorgesinval commented 1 year ago

One suggestion that I would like you to consider is to add the option of using round parentheses when reporting confidence intervals. Strictly speaking, for a continuous DF the intervals should be open, which is usually indicated with round parentheses.
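
As an illustration only, such an option could be a single argument on the interval formatter. `format_ci()` and its `brackets` argument are hypothetical and not part of papaja:

```r
# Hypothetical helper: choose the delimiters used around a confidence interval.
format_ci <- function(lower, upper, level = 0.95, digits = 2,
                      brackets = c("square", "round")) {
  brackets <- match.arg(brackets)
  delims <- switch(brackets, square = c("[", "]"), round = c("(", ")"))
  bounds <- formatC(c(lower, upper), digits = digits, format = "f")
  paste0(level * 100, "% CI ", delims[1], bounds[1], ", ", bounds[2], delims[2])
}

format_ci(0.123, 0.987)                      # "95% CI [0.12, 0.99]"
format_ci(0.123, 0.987, brackets = "round")  # "95% CI (0.12, 0.99)"
```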