tidyverse / style

The tidyverse style guide for R code
https://style.tidyverse.org

Advice for long formulas #187

Open MichaelChirico opened 2 years ago

MichaelChirico commented 2 years ago

Is there any advice we want to encode for long formulas?

Currently the guide only discusses formulas w.r.t. operator spacing, and only in the context of one-sided formulas:

https://style.tidyverse.org/syntax.html#infix-operators

It's quite common for us to see formulas stretch beyond one line, sometimes to 4-5 lines or more. I am not sure what rules to apply for best readability in this case.

  1. Should a terminal ~ always induce indentation (so that the RHS is always indented relative to the LHS)?
  2. Should a terminal + (or occasionally -) induce indentation on the next line?

Originally raised here: https://github.com/r-lib/styler/issues/900

Key example -- how should the following be formatted (as might come up, e.g., in a dcast() reshaping):

long_y_variable1 + long_y_variable_2 +
  long_y_variable_3 ~
    long_x_variable_1 + long_x_variable_2 +
      long_x_variable_3

Can any of those indentation levels be removed?

Perhaps the problem is more generally about which operators do/don't induce indentation on the next line -- currently there's no discussion of long arithmetic statements either. It seems to me splitting on lower-precedence operators is a good general rule (so split on +, not *, where possible).
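
For reference, the parse tree shows which operator sits outermost in an expression. The following is a minimal sketch in base R; `top_op` is a hypothetical helper written for this illustration and is not part of the style guide or styler. In R, `*` binds more tightly than `+`, and `~` binds less tightly than both, so breaking a line at `~` first and then at `+` follows the outermost structure of the call.

```r
# Hypothetical helper: return the outermost operator of a quoted expression.
top_op <- function(expr) as.character(expr[[1]])

# `*` binds tighter than `+`, so `+` is the outermost call here.
arith_top <- top_op(quote(a + b * c))

# In a two-sided formula, `~` is the outermost call ...
f <- quote(y1 + y2 ~ x1 + x2 * x3)
formula_top <- top_op(f)

# ... and on the right-hand side, `+` is again outermost.
rhs_top <- top_op(f[[3]])

print(c(arith_top, formula_top, rhs_top))
```

Under this reading, a break after `+` splits the expression at its top level, whereas a break after `*` splits a sub-term.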

hadley commented 1 month ago

I would write that as:

long_y_variable1 + long_y_variable_2 + long_y_variable_3 ~
  long_x_variable_1 + long_x_variable_2 + long_x_variable_3

But I'm not sure what I'd do if the LHS and the RHS were both too long to fit one line. That feels like one reason that we've generally moved away from use of formulas in our interfaces (except for modelling, of course, where it still makes sense).
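
As a side note, the layouts discussed in this thread differ only in whitespace, so the choice is purely about readability. A minimal sketch in base R, assuming both formulas are created in the same environment (the names `f_stepped` and `f_flat` are made up for this illustration):

```r
# Stepped layout from the original question: each continuation is indented further.
f_stepped <- long_y_variable1 + long_y_variable_2 +
  long_y_variable_3 ~
    long_x_variable_1 + long_x_variable_2 +
      long_x_variable_3

# Flatter layout: one line per side, with the RHS indented once.
f_flat <- long_y_variable1 + long_y_variable_2 + long_y_variable_3 ~
  long_x_variable_1 + long_x_variable_2 + long_x_variable_3

# Both parse to the same formula object; only the source layout differs.
same_formula <- identical(f_stepped, f_flat)
print(same_formula)
```

The variables do not need to exist for this to run, because creating a formula does not evaluate its terms.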