rubocop / ruby-style-guide

A community-driven Ruby coding style guide
https://rubystyle.guide
16.47k stars 3.4k forks source link

The '%w' and '%i' syntax for arrays is just a bad idea #908

Open nello opened 2 years ago

nello commented 2 years ago

Why are these forms favoured? They're harder to read than the original syntax and easily lead to errors with commas. I've personally lost hours chasing non-existent bugs thanks to this "feature" because the eye naturally skips over commas in lists.

I get that people seem to love new shiny things, but this is just a poor decision IMHO.

pirj commented 2 years ago

You must be referring to the %w guideline:

Prefer %w to the literal array syntax when you need an array of words (non-empty strings without spaces and special characters in them). Apply this rule only to arrays with two or more elements.

with the following examples:

# bad
STATES = ['draft', 'open', 'closed']

# good
STATES = %w[draft open closed]

Do you have some counterexamples in mind?

As the style guide strives to follow the community usage of the Ruby language,

A style guide that reflects real-world usage gets used

would you like to harvest real-world-ruby-apps meta-repo to have some evidence that the literal string array syntax is used more often than %w?

nello commented 2 years ago

Thanks for the reply.

The example that keeps occurring for me is the process of adding another token to a %w-ified list. As array elements in nearly all languages have commas between them, silently automatically adding the comma into the token just causes pain.

The other one is long lines:

STATES = %w[This is an example of what happens when you wrap over a line and there are more words than I,
or anyone else can parse in a gestalt]

Is there a way to harvest real-world usage from before the recommendation was added to the style guide? It has an undue influence for better or worse, as it gets rolled into tooling (especially IntelliJ/RubyMine in my case).

pirj commented 2 years ago

silently automatically adding the comma into the token just causes pain

The guideline non-ambiguously tells to "strings without ... special characters in them", where a comma is a special character. Such usage would be detected by a corresponding cop and would save you from the pain:

Within %w/%W, quotes and ',' are unnecessary and may be unwanted in the resulting strings

Are there other real daily usage examples that are problematic for you?

Is there a way to harvest real-world usage from before the recommendation was added to the style guide?

Things are as they are now. We don't doubt much about the effectiveness of boolean logic vs tri-state logic, using null, or byte length, even though, as an after-thought, those might not have been the best design decisions. I'm just stating that the guide reflects modern real-life usage, even though early iterations of the guide might have affected real-world usage.

bbatsov commented 2 years ago

Agreed. We generally try to avoid affecting real-world usage, but rather reflect it in the guide. Btw, I'm not a big fan of %-literals myself and I've often publicly criticized them, but we can't deny they have always been quite popular in the wild.

nello commented 2 years ago

Re the "corresponding cop" suggestion I'm pretty dubious that fixing issues exacerbated by tooling with more tooling is a virtuous cycle. I think that the guide has a degree of influence it's wrong to minimise. If a practice is poor, is popularity enough to make it a recommendation?

pirj commented 2 years ago

Are you confident enough to state that a certain practice is poor, basing on just your personal experience? If you look at this real-world- meta-repos, you would notice that not all of them use RuboCop, and they might even use alternative Ruby style guides. We are not affiliated with those meta-repos, and use them as an open data reference.

If a practice is poor, is popularity enough to make it a recommendation?

I understand what you mean. Yes. As Bozhidar mentions above about his personal preference to avoid %w, I have my personal preferences, too, that are going against what the style guide tells, usage of or/and being a notable example. Still, we try to be objective and unbiased towards personal preferences.

Drenmi commented 2 years ago

The argument for this is reduced git churn for multiline lists, as only one line will be touched when re-ordering, preserving git blame, etc. To that end, the argument of problems with commas apply equally to the "normal" form, as it has happened quite frequently at work that someone re-ordered an array literal and forgot to move the comma. 😅

To me the strongest argument against using this style is that it can't be consistently applied. I prefer rules that aren't applied conditionally, like this one is as it requires the qualifier "non-empty strings without spaces and special characters in them".

Personally I'm on the fence about which one is "better". There's clearly a case to be made for either. But as it's a community style guide, I think we need to tolerate that the community might not always share our personal preferences.

JohnCoates commented 1 year ago

would you like to harvest real-world-ruby-apps meta-repo to have some evidence that the literal string array syntax is used more often than %w?

@pirj I'm interested in this issue as well. Is there a guide for harvesting?

pirj commented 1 year ago

@JohnCoates something like:

  1. Clone it with sub modules (might require fixing refs as there are dead ones, and ones with renamed integration branches)
  2. Write a cop that would raise an offense for %i usage
  3. Write a cop that would detect %w
  4. Write a cop that would detect []
  5. Run those three cops separately against the metarepo. You may need to remove all .rubocop*.yml files from there.
  6. Compare the number of offences for each

there was a tool that would save you on steps 2-5 that just goes through a directory and prints all matched usages https://jonatas.github.io/fast/, but I have never used it. Grep would not be a viable option.

pirj commented 1 year ago

easily lead to errors with commas

I believe there is or should be a cop that would catch punctuation inside percent literals, as they certainly are unintentional in most cases if not always.

marknuzz commented 1 month ago

The argument for this is reduced git churn for multiline lists, as only one line will be touched when re-ordering, preserving git blame, etc.

@Drenmi IMO that's not a great argument, ideally you'd be using git-blame-ignore-revs which has been available since 2019.

That being said, it sounds like the OP doesn't believe in linters (which have been in use since 1978) and I feel that's the root cause of the OP's problem, not the style guide (which as others pointed out, also addresses the scenario). Linters are the only realistic way to ensure a consistent application of a style guide. By definition, if you aren't using them, you're not applying the style guide. The OP's admission that not all the rules are being followed is proof of that. So I feel this issue is moot.