rubberduck-vba / Rubberduck

Every programmer needs a rubberduck. COM add-in for the VBA & VB6 IDE (VBE).
https://rubberduckvba.com
GNU General Public License v3.0
1.91k stars 299 forks source link

Wildcards in whitelist #4504

Open SmileyFtW opened 5 years ago

SmileyFtW commented 5 years ago

It would be nice to be able to use wildcards for names in the code inspection whitelist: txtPlayerNumber1, txtPlayerNumber2, txtPlayerNumber3, etc, could be specified in a single entry as txtPlayerNumber#

retailcoder commented 5 years ago

The whitelist doubles as a list of valid prefixes and identifiers: txt would remove all "Hungarian notation" results for txt* identifiers.

Use meaningful names would probably still fire results for numbered identifiers though... but it's specifically designed to do so. Numbered identifiers are a code smell; in a vast majority of cases, it is preferable to use an array or a collection of some sort - hence the by-design nature of flagging numbered identifiers as poor names. In VB6 you could have control arrays, unfortunately that isn't an option with MSForms in VBA.

I don't completely disagree with supporting wildcards, but it seems to enable mass-undermining the use meaningful names inspection... something that can also be done by simply disabling it.

If the number of players is unknown at compile-time, consider creating the textbox controls at run-time - in which case it becomes easy to use a name that doesn't trigger the inspection, and then you can craft a UI that is more representative of your domain model. OTOH if there's a fixed number of players and the UI can reasonably be made at design-time, I'd probably still want to whitelist each control name individually.

So... I'm kinda torn on this one.

daFreeMan commented 5 years ago

Player1, Player2 ... PlayerX are pretty special cases for numbered identifiers where most sane people would say "yeah, those make sense". As such, I'm in the "Whitelist them individually" camp because there are cases where there are sane, logical reasons for giving identifiers such names. Besides, how many players are there going to be in your game?

If you're going to allow a wildcard, do you have to support . to represent a single character and * to represent 1 or more (or is that 0 or more) characters? Do you go full RegEx. Do you use file system globbing rules?

retailcoder commented 5 years ago

Player1, Player2 ... PlayerX are pretty special cases for numbered identifiers where most sane people would say "yeah, those make sense"

Depends. If I'm writing a game of Battleship, yes (fixed players, few players: Player1, Player2). If I'm making a game of Monopoly or Risk (variable players, more than 2 max players), or a game where the number of players is unknown and the maximum is arbitrary or irrelevant, then I'd rather have a players collection and avoid the numbering altogether.

As for wildcards, I'd go with ? for 0-1 character, and * for 0-n characters (regex 0-n is *, 1-n is +)... i.e. DOS-style. Regex and pattern matching seems overkill.

retailcoder commented 5 years ago

Regex and pattern matching seems overkill.

OTOH, this is an IDE, and we're matching identifiers. Pattern matching does make sense... But then I'd go RegEx over Like syntax, but then regular expressions aren't exactly trivial for everyone.. so maybe a flag to enable regex patterns, otherwise default to filesystem wildcards... and now this feels like scope/feature creep... for a feature that ultimately will use this to allow bad variable naming to go unflagged: a pattern like * essentially makes the entire inspection work hard for nothing.

The crux of the problem seems to be that it's impossible to annotate a form control with an @Ignore annotation. Not sure the solution is to make the whitelist support wildcards.

I'll keep this issue open and up-for-grabs, and won't refuse a PR that implements it... But I'm not sold at all on it being a good idea.

Vogel612 commented 5 years ago

An idea for implementation might be something like the following:

Allow Wildcard specifications with *, # and & for "anything", "number" and "text" These wildcards are not regex wildcards, but could be transformed to "proper regex" as follows:

* -> .*
# -> \d+
& -> \w+

patterns that only contain wildcards could be rejected. For better speed, the compiled regex patterns should be compiled and cached. Invalidating that regex cache is trivially done when the list of ignored identifiers is changed.

Regex matches should be bounded on both ends with an anchor.

bclothier commented 5 years ago

I also wonder if this would be better if instead of wildcard, we only allow a range, which would close the range that could be match:

Player[1-4] => permits only 4 entries Command([En]|[Dis])abled => permits only 2 entires; CommandEnabled or CommandDisabled.

This way, those can be expanded into an array behind the scene. That wouldn't then require regex matching at all.

comintern commented 5 years ago

@retailcoder - If the root issue is not being able to annotate @Ignore a form control, there are a couple ways we could address that specifically without descending down the rabbit hole of custom patterns at all.

One would be to allow whitelisting of AsTypeNames, for example allow whitelisting MSForms.Control or Excel.TextBox as types.

Another would be to extend the @Ignore annotation to allow for an identifier or class at the module level. Something like @Ignore HungarianNotation(MSForms.Control).

retailcoder commented 5 years ago

One would be to allow whitelisting of AsTypeNames, for example allow whitelisting MSForms.Control or Excel.TextBox as types.

Effectively whitelisting TextBox1 and ComboBox12?

comintern commented 5 years ago

That was my thought. Hell, you could use it to ignore oFoo and still get results for strFoo. By whitelisting Object.

SmileyFtW commented 5 years ago

In my situation I do use a collection for the players themselves as objects, but the issue is with the controls on the user form displaying player info for which there is a specific maximum quantity that is known at design time(1 to a maximum of 6 active players and 0 to 4 deactivated players). It is no big deal to explicitly include each one in the whitelist.

Of course, since I don’t know about the inner workings and multiple uses RD has in the whitelist, my request was a bit out of school. I agree with a simple ? for a single character and # for a single digit with * for any number of characters or digits. No need to get any more complex than that I would think.

Thanks for the consideration and the education.

David

SmileyFtW commented 5 years ago

I’d like the idea of an annotation in lieu of going the whitelist route, but don’t know how that would be implemented for controls on a form (which is what got me onto this) unless it was hosted in the code behind (in the module declarations section?).