Open patrick-studer opened 1 month ago
Evening @patrick-studer ,
I was writing the test for the regex and just wanted to ensure that you would expect FIFO and MY_FIFO to fail. The regex expects at least one underscore to pass.
--Jeremy
Evening @patrick-studer ,
Disregard my previous comment. I just pushed an update to the issue-1202 branch. When you get a chance, could you check it out on your end and let me know if it is working for you?
Thanks,
--Jeremy
Hi @jeremiah-c-leary
> Disregard my previous comment. I just pushed an update to the issue-1202 branch. When you get a chance could you check it out on your end and let me know if it working for you?
I tested your implementation and it does exactly what I expected. But your previous comment made me think a bit, and I came to the conclusion that my pattern should not be called Pascal_Snake_Case (because that is not what this term officially means)!
> I was writing the test for the regex and just wanted to ensure that you would expect FIFO and MY_FIFO to fail. The regex expects at least one underscore to pass.
I see your point!
We have 2 problems: one with my Pascal_Snake_Case implementation, and the other in the existing PascalCase/camelCase regex.
Let's have a detailed look into it.
The pattern I requested was something like this: "WhatA_WonderfulDay". This is PascalCase with the allowance of underscores (followed by a capital letter or a number). Without further thinking, I named this pattern Pascal_Snake_Case...
According to definitions found on the web, Pascal_Snake_Case should separate every single word with an underscore. Therefore, every capital letter (except the first one) must have a preceding underscore:
My regex does not match that pattern, since my definition allows multiple capital letters in one underscore-separated part. Therefore, I propose to additionally introduce the real Pascal_Snake_Case pattern, as well as Relaxed_PascalCase:
- Pascal_Snake_Case (the real one)
- Relaxed_PascalCase (PascalCase with underscores)

If you have a better name for it, please feel free to change it!
Before I come to my proposal for an implementation, let me tackle the second problem I see.
This is now very complicated to explain. I hope you may follow my examples...
Your regex allows any number of capital letters, as long as they are followed by at least one lowercase letter or digit (i.e. (?:[A-Z])+(?:[a-z0-9])+).
This also matches names which do not follow the standard and should raise an error/warning (e.g. MyAXIBusArbiter/myAXIBusArbiter).
PascalCase/camelCase defines that every word must start with a capital letter and the rest is lowercase.
For abbreviations, that means the correct way of writing would be MyAxiBusArbiter/myAxiBusArbiter.
If you add a number at the end of a name, it even matches everything (so not really helpful for linting).
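To make the problem concrete, here is a minimal sketch. The complete existing regex is not quoted in this thread, so `old_pascal_case` below is an assumption: it simply repeats the quoted fragment one or more times, and it assumes the whole name must match (`fullmatch`). It is a stand-in for illustration, not the actual VSG source.

```python
import re

# Assumption: the fragment quoted above, repeated one or more times.
# Hypothetical stand-in for the existing PascalCase check.
old_pascal_case = re.compile(r"(?:(?:[A-Z])+(?:[a-z0-9])+)+")


def accepts(name: str) -> bool:
    """Return True if the whole name matches the stand-in regex."""
    return old_pascal_case.fullmatch(name) is not None


for name in (
    "MyAxiBusArbiter",        # proper PascalCase
    "MyAXIBusArbiter",        # abbreviation kept in capitals
    "WHATAWONDERFUL19DAY42",  # all caps, but digits after each run of capitals
    "WHATAWONDERFULDAY",      # all caps, no digits or lowercase
):
    print(f"{name}: {accepts(name)}")
```

Under these assumptions only the last name is rejected. The all-caps name with digits slips through because digits satisfy the "followed by at least one lowercase letter or digit" part.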
First I thought, we should only allow 1 capital letter in a series, but this is also not good since there are "single-character-words" which lead to 2 capital letters in a row (e.g. WhatAWonderfulDay).
So, I tried to find a regex, which could deal with that but also does find the previously ignored errors...
I came up with the following proposal.
- Pascal_Snake_Case (the real one)
- Relaxed_PascalCase (PascalCase with underscores)

Here is my updated code for the case-regex:
camelCase = re.compile("(?!.*[A-Z]{3})[a-z][a-zA-Z0-9]*")
PascalCase = re.compile("(?!.*[A-Z]{3})[A-Z][a-zA-Z0-9]*")
PascalSnakeCase = re.compile("(?!.*[A-Z]{3})[A-Z][a-z0-9]*(?:_[A-Z0-9][a-z0-9]*)*")
RelaxedPascalCase = re.compile("(?!.*[A-Z]{3})[A-Z][a-zA-Z0-9]*(?:_[A-Z0-9][a-zA-Z0-9]*)*")
The new thing I propose is a "negative lookahead" to mismatch if three capital letters in a row are found (i.e. (?!.*[A-Z]{3})). The rest is basically the same, just without the non-capturing grouping.
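The effect of the lookahead can be seen in isolation with a small sketch. It compares the proposed PascalCase pattern with the same pattern minus the lookahead. The names `with_lookahead`, `without_lookahead` and `check` are only illustrative, and the use of `fullmatch` is an assumption about how the rule applies the regex.

```python
import re

# Proposed PascalCase pattern, copied from the proposal above.
with_lookahead = re.compile(r"(?!.*[A-Z]{3})[A-Z][a-zA-Z0-9]*")
# Same pattern without the negative lookahead, for comparison only.
without_lookahead = re.compile(r"[A-Z][a-zA-Z0-9]*")


def check(name: str) -> tuple[bool, bool]:
    """Return (matches with lookahead, matches without lookahead)."""
    return (
        with_lookahead.fullmatch(name) is not None,
        without_lookahead.fullmatch(name) is not None,
    )


for name in ("MyAxiBusArbiter", "WhatAWonderfulDay", "MyAXIBusArbiter"):
    print(name, check(name))
```

The lookahead is evaluated once at the start of the name: `.*[A-Z]{3}` scans the whole string for three consecutive capitals and, if it finds them, the match fails before the rest of the pattern is tried. Two capitals in a row, as in WhatAWonderfulDay, are still allowed.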
I did a quick test with a set of port-names, checked with port_010 rule.
"ok" means, this is matching the regex and does not produce error/warning. "x" means does not match and will report error/warning.
You may see that I also added numbers to the example. This is to demonstrate the improvement given with my new regex patterns. (I hope nobody will call a signal/port like that - but for demonstration purpose it was good!)
port_010:case | lower | upper | upper_or_lower | camelCase | PascalCase | Pascal_Snake_Case | Relaxed_PascalCase |
---|---|---|---|---|---|---|---|
whatawonderful19day42 | ok | x | ok | ok | x | x | x |
whatAwonderful19day42 | x | x | x | ok | x | x | x |
WHATAWONDERFUL19DAY42 | x | ok | ok | x | x | x | x |
WHATaWONDERFUL19DAY42 | x | x | x | x | x | x | x |
what_a_wonderful_19_day_42 | ok | x | ok | x | x | x | x |
what_a_wonderful19_day42 | ok | x | ok | x | x | x | x |
WHAT_A_WONDERFUL_19_DAY_42 | x | ok | ok | x | x | x | x |
WHAT_A_WONDERFUL19_DAY42 | x | ok | ok | x | x | x | x |
whatAWonderful19Day42 | x | x | x | ok | x | x | x |
whatAWOnderful19Day42 | x | x | x | x | x | x | x |
WhatAWonderful19Day42 | x | x | x | x | ok | x | ok |
WhatAWOnderful19Day42 | x | x | x | x | x | x | x |
What_A_Wonderful_19_Day_42 | x | x | x | x | x | ok | ok |
What_A_WOnderful_19_Day_42 | x | x | x | x | x | x | ok |
WhatA_Wonderful19Day_42 | x | x | x | x | x | x | ok |
WhatA_WOnderful19Day_42 | x | x | x | x | x | x | ok |
WhatA_WONderful19Day_42 | x | x | x | x | x | x | x |
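The four mixed-case columns of the table can be reproduced with a short script. This is a sketch that assumes the rule requires the whole name to match (`fullmatch`); the helper `matching_styles` is hypothetical and not part of VSG. The lower/upper columns are left out because their regexes are not shown in this thread.

```python
import re

# Patterns copied from the proposal above.
PATTERNS = {
    "camelCase": re.compile(r"(?!.*[A-Z]{3})[a-z][a-zA-Z0-9]*"),
    "PascalCase": re.compile(r"(?!.*[A-Z]{3})[A-Z][a-zA-Z0-9]*"),
    "Pascal_Snake_Case": re.compile(
        r"(?!.*[A-Z]{3})[A-Z][a-z0-9]*(?:_[A-Z0-9][a-z0-9]*)*"
    ),
    "Relaxed_PascalCase": re.compile(
        r"(?!.*[A-Z]{3})[A-Z][a-zA-Z0-9]*(?:_[A-Z0-9][a-zA-Z0-9]*)*"
    ),
}


def matching_styles(name: str) -> set[str]:
    """Return the set of styles whose regex matches the whole name."""
    return {
        style
        for style, regex in PATTERNS.items()
        if regex.fullmatch(name) is not None
    }


for name in (
    "whatAWonderful19Day42",
    "WhatAWonderful19Day42",
    "What_A_Wonderful_19_Day_42",
    "WhatA_WOnderful19Day_42",
    "WhatA_WONderful19Day_42",
):
    print(name, sorted(matching_styles(name)))
```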
Is your feature request related to a problem? Please describe.
The provided case-checking rules are currently (before v3.25) limited to:
- lower (which allows underscores too => snake_case),
- upper (which allows underscores too => SCREAMING_SNAKE_CASE),
- upper_or_lower (which allows one or the other, no mixture),
- PascalCase (which does not allow underscores...),
- camelCase (which does not allow underscores...).

What I'm missing (or was missing before v3.25) is the possibility to define snaked-PascalCase. This is basically the same as PascalCase but allows optional separation/grouping of word groups with an underscore. The first letter of the word, and every letter after an underscore must be uppercase and the rest must be lowercase.
Example: _MyCore_DataChannelClockEnable
Describe the solution you'd like
With the new release v3.25, you allow regex checks to be introduced for the case rules, which is very handy if you have one rule which should differ from the "defaults". But since Pascal_Snake_Case is a commonly used format (and it is our default style for almost everything (signals, constants, variables, procedures, functions, ...)), it would be very nice to have it supported by VSG.
Would you mind adding the new case-keyword (like Pascal_Snake_Case) together with the matching regular expression string?
_vsg/rules/caseutils.py:
Pascal_Snake_Case = re.compile("(((?:[A-Z])+(?:[a-z0-9])*)+_?)+")
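A quick way to see what this requested regex accepts is the following sketch. The helper name `is_pascal_snake` is hypothetical, and applying the regex with `fullmatch` is an assumption about how the rule would use it.

```python
import re

# Regex copied verbatim from the request above.
Pascal_Snake_Case = re.compile("(((?:[A-Z])+(?:[a-z0-9])*)+_?)+")


def is_pascal_snake(name: str) -> bool:
    """Assumption: the whole identifier has to match the regex."""
    return Pascal_Snake_Case.fullmatch(name) is not None


for name in (
    "MyCore_DataChannelClockEnable",  # word groups separated by one underscore
    "my_core",                        # starts with a lowercase letter
    "MyCore_dataChannel",             # lowercase letter after the underscore
    "MyCore__Data",                   # two underscores in a row
):
    print(f"{name}: {is_pascal_snake(name)}")
```

Under these assumptions only the first name is accepted.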