r-lib / styler

Non-invasive pretty printing of R code
https://styler.r-lib.org
Other
703 stars 70 forks source link

Remove blank lines after opening and before closing braces #1195

Closed IndrajeetPatil closed 2 months ago

IndrajeetPatil commented 2 months ago

Closes #1032

This is only one example. Please check newly added tests to get a feel for this change:

styler::style_text(text = '(

  require("logspline") &&
  require("rstanarm")

)')
#> (
#>   require("logspline") &&
#>     require("rstanarm")
#> )

Created on 2024-05-01 with reprex v2.1.0

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if f5744da371242da992fff46b35253aa80cba6844 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4b24ff67f262d8db3630cfb2d9e1cee384fc2469 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

lorenzwalthert commented 2 months ago

Thanks. Out of experience (and because I produced bugs before), I know that removing line breaks is problematic when there are comments, also see https://styler.r-lib.org/articles/customizing_styler.html#showcasing-the-development-of-a-styling-rule. Hence: Can you modify some tests to cover situations with comments (and adapt the source code if necessary)? Also, the other thing we have to keep in mind is cache invalidation and transformer dropping. Here it seems not that they play a big role. Maybe we can also create developer documentation for that in CONTRIBUTING.md elsewhere, because some of it bit me in the foot already in the past multiple times (and there are other things I can't recall here).

IndrajeetPatil commented 2 months ago

Can you modify some tests to cover situations with comments (and adapt the source code if necessary)?

I have added some tests with comments. PTAL.

As for updating contributing guidelines, I think that can be handled in a separate PR, although it's not completely clear to me what needs to be added, but I can give it a go.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4b24ff67f262d8db3630cfb2d9e1cee384fc2469 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 99afa8df27ab65c011bb89ff5437ad32c31bd093 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 99afa8df27ab65c011bb89ff5437ad32c31bd093 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 99afa8df27ab65c011bb89ff5437ad32c31bd093 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

lorenzwalthert commented 2 months ago

Looks good in general, I don't know if we need to boost the testing suite so much though (you are adding 500 lines of tests here)... One concern is that testing time increases (on CRAN, but also during the local development cycle) and also it becomes harder to navigate a big test suite for developers. Can you cut down on a test or two?

lorenzwalthert commented 2 months ago

Also, can you explain the rationale for making two different functions? I initially was scared it would be a performance problem, but seems not... In general I think we prefer separation of concern (as you did with the implementation of separate functions), but on the other hand having an abstract rule (line breaks around braces) should also correspond to one transformer, so I am not sure we should split it up here.

IndrajeetPatil commented 2 months ago

Looks good in general, I don't know if we need to boost the testing suite so much though (you are adding 500 lines of tests here)... One concern is that testing time increases (on CRAN, but also during the local development cycle) and also it becomes harder to navigate a big test suite for developers. Can you cut down on a test or two?

I am literally reusing the existing tests. How can that be increasing test time? Note that most of the diffs are from data tree. I am not sure why we need them, but they introduce big diffs.

Some quantitative data on the time required to run tests (unit: seconds):

OS main PR
windows-latest-R 49 50
ubuntu-devel-R 66 65
mac-latest-R 48 49
IndrajeetPatil commented 2 months ago

Also, can you explain the rationale for making two different functions?

The guard clauses and the braces involved are completely different, and so it doesn't make sense to club them together. Doing so couples two separate concerns: removing leading blank lines and terminal blank lines.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 0f9625a3c05888207a7d2d91ce5647e82af36ace is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 0f9625a3c05888207a7d2d91ce5647e82af36ace is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 0f9625a3c05888207a7d2d91ce5647e82af36ace is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4f782529e2419471d9abcf60899bf28a34bd5268 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

lorenzwalthert commented 2 months ago

I am literally reusing the existing tests. How can that be increasing test time? Note that most of the diffs are from data tree. I am not sure why we need them, but they introduce big diffs.

My bad sorry. Since the tree files are not rendered by default in this repo, I initially did not see they caused a large part of the diff. Let's discuss those in https://github.com/r-lib/styler/issues/1206.

lorenzwalthert commented 2 months ago

The guard clauses and the braces involved are completely different, and so it doesn't make sense to club them together. Doing so couples two separate concerns: removing leading blank lines and terminal blank lines.

I see that they are different, but different is relative, and if you look at the code base, you can see that there are rules like set_line_break_before_curly_opening or set_line_break_around_curly_curly, which have a rather large scope. The rule you add seems to be about removing redundant blank lines between an opening and closing brace. Many other line break rules are about whether or not there should be a line break, e.g. at the opening of a call, set_line_break_after_opening_if_call_is_multi_line governs that:

call(
  many = args
)
# vs 
call(many, 
  args = 2
)

By setting pd$lag_newlines to 1L or 0L, we implicitly remove extra blank lines after the opening ( and hence this rule handles extra blank lines for that context already.

In other situations, e.g. for style_line_break_around_curly, you will see that strict determines whether to set the number of line breaks between tokens to exactly one or if we should keep the number of blank lines if it is already more than one.

These things make me wonder how extra blank lines should be removed. Should there be only one fallback rule (which would be even greater in scope than those two that we have now) to set blank lines in general to at most one consecutive (exceptions to be defined) and other rules should maybe only deal with zero vs one blank line distinction?

In case you did not notice, the order of transformers can also play a role. You put yours last. To me, we could also put this rule first (hopefully tests would all pass).

I do appreciate this discussion with you, as I think it makes me think and explain the status quo (which is by no means perfect) and hopefully, this ultimately leeds to better code quality.

IndrajeetPatil commented 2 months ago

@lorenzwalthert Thanks for the detailed explanation!

Yes, makes sense now, and so I have switched to use only one transformer and I have also put it at the start, and all tests still pass. So this should be ready for another round of review.

Blocked by #1208

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4f782529e2419471d9abcf60899bf28a34bd5268 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4f782529e2419471d9abcf60899bf28a34bd5268 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4f782529e2419471d9abcf60899bf28a34bd5268 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.22%. Comparing base (1088ede) to head (18bae50).

:exclamation: Current head 18bae50 differs from pull request most recent head f9d4ab4

Please upload reports for the commit f9d4ab4 to get more accurate results.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1195 +/- ## ========================================== + Coverage 90.76% 92.22% +1.46% ========================================== Files 46 46 Lines 2652 2675 +23 ========================================== + Hits 2407 2467 +60 + Misses 245 208 -37 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4f782529e2419471d9abcf60899bf28a34bd5268 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions[bot] commented 2 months ago

This is how benchmark results would change (along with a 95% confidence interval in relative change) if f9d4ab445effa922061063a0887d51a4213b2be4 is merged into main:

Further explanation regarding interpretation and methodology can be found in the documentation.