Closed harry-commin-enclustra closed 2 years ago
Hey @hcommin,
There are quite a few rules we could add for record types:
I may have missed a couple.
There are existing rules which already perform those checks. At this point it takes longer to write the tests and the documentation than it takes to implement the actual rule.
I'm not sure if these options could be handled similarly to simple cases like assert_002 or assert_004 because I'm not sure how this interacts with the indentation configuration. My guess is that there is no problem because the record elements simply have +1 indentation.
The default indent for a record_type_definition
is:
390 record_type_definition:
391 record_keyword:
392 token : current
393 after : '+1'
394 end_keyword:
395 token : '-1'
396 after : '-1'
and full_type_declaration
is:
205 full_type_declaration:
206 type_keyword:
207 token : current
208 after : current
So in your example:
-- record keyword on same line
type x is record
a : bit;
end record;
...and assuming the beginning indent is 2...
when VSG hits type
it finds the full_type_declaration
key and then the type_keyword
key. VSG sees that token
is set to current
so it assigns the indent of 2 to the type
keyword token. It then sees the after
key is current
and keeps the indent at 2.
When VSG hits the record
keyword, it checks the record_type_definition[record_keyword]
entry and assigns the record
token a value of 2 because of the current
key. VSG then sets the indent to 3 because of the +1
in the after
key.
This visualization might help:
-- Indent 2 2 2 2
type x is record
-- Indent 3 3 3 3
a : bit;
-- Indent 2 2 2
end record;
So in the second case, with the record on the next line it would actually match the indent in your example:
3 -- Indent 2 2 2
4 type x is
5 -- Indent 2
6 record
7 -- Indent 3 3 3 3
8 a : bit;
9 -- Indent 2 2 2
10 end record;
because the record keyword was given an indent of 2 and the indent of the record
keyword is only checked if record
it at the beginning of a line.
Let me know what you think of my list of rules and if I forgot anything.
Regards,
--Jeremy
@jeremiah-c-leary Thank you for the detailed example of indentation configuration. If I understood correctly, the indent is only checked for items that are at the beginning of a line (which seems obvious, now that I write it). So, if we want an item to be on its own line, then we use a rule... and the indentation configuration just sets the indentation correctly at the start of each line. For example, if we want to convert from this:
type x is record a : bit; end record;
to this:
type x
is record
a : bit;
end record;
then several "abc on its own line" and/or "no code after xyz" rules would need to be defined. Without these rules, the indentation configuration would only fix the indent:
type x is record a : bit; end record;
Is that correct?
Your list of record
rules makes me realize I only thought about 2 "valid" coding styles I am familiar with, and not the large family of violations we actually want to detect and correct.
Your list seems comprehensive to me. I inserted my minor comments in sub-bullets:
record
on the same line as is
? (See my example above with is record
on its own line).action : 'add'
etc.type
. I don't know if rules make sense for indents of identifier or is
.I can certainly attempt to help with writing documentation. My Python skills are a work in progress, so you may not want me going near your test code.
@jeremiah-c-leary In my workplace, we require a different suffix for record type identifiers (_r
, instead of _t
for all other types).
I don't mind just setting suffixes : ['_t', '_r']
for all types. But if it's easy to define independent rules, then there could be some value in that.
We are also required to suffix enumeration literals with _s
. So, would it be possible to add prefix/suffix rules for these identifiers too? Example:
-- Violation
type state_machine is (idle, write, read, done);
-- Fix
type state_machine is (idle_s, write_s, read_s, done_s);
Hey @hcommin,
Is that correct?
That is correct.
Rules are grouped into phases that would match the steps I would take in reformatting a file. Each phase transforms the code for the next phase to operate on. This means every rule in a later phase is making an assumption about what was performed in a previous phase.
To take your example to an extreme:
type
x
is
record a : bit; end
record
;
The first phase handles all the structural changes, like same line
, own line
and optional
rules and would transform to the code into:
type x is record
a : bit;
end record x;
The second phase handles whitespace, but in this example there are no whitespace issues because I put everything on it's own line. So the code stays as:
type x is record
a : bit;
end record x;
The third phase is vertical space, or blank lines. if we assumed we had a rule to add a blank line after the record keyword and before the end keyword, the code would be transformed into:
type x is record
a : bit;
end record x;
The fourth phase is indentation. So the code would be transformed into:
type x is record
a : bit;
end record x;
The fifth phase is alignment, which is anything that should be column aligned that is not an indent. In this case no rules apply. so the code stays as:
type x is record
a : bit;
end record x;
The sixth phase is capitalization. If we assume we configured a rule that the identifier must be uppercase, then the code would be transformed into:
type X is record
a : bit;
end record X;
The seventh phase checks for naming conventions. So if we configured the rule that all types must have a prefix of t_
, then VSG would log a violation since X
does not start with t_
.
Your list of record rules makes me realize I only thought about 2 "valid" coding styles I am familiar with, and not the large family of violations we actually want to detect and correct.
Yeah...when you really start to look at what it takes to enforce a certain coding style, then the number of rules increases dramatically.
Should this be: record on the same line as is? (See my example above with is record on its own line). I agree. How are rule conflicts handled? Will the user see an error if this rule is enabled at the same time as the one above? I may have a rule that can be configured for either, I would need to check. Otherwise, there is currently no method to handle conflicting rules. However, every attempt is made to ensure they are not conflicting.
Maybe I could implement a method to flag conflicting rules and notify the user.
Should this be keywords (plural)? I would only indent the first
record
keyword if it is on it's own line.A rule already exists for indent of type. I don't know if rules make sense for indents of identifier or is. We could implement rules for indenting identifier and
is
, but I would hope nobody would want to have those on their own lines. Although I do not see the purpose of havingrecord
on it's own line either. One thing I have learned from this project is their are a plethora of VHDL coding styles.
I can start implementing the rules above.
--Jeremy
I don't mind just setting suffixes : ['_t', '_r'] for all types. But if it's easy to define independent rules, then there could be some value in that.
I want to think that should not be too hard to implement. I would create an issue and we can explore it.
We are also required to suffix enumeration literals with _s. So, would it be possible to add prefix/suffix rules for these identifiers too?
To get an idea for how hard it would be to implement a rule, you can use the vsg_parser
script to see how VSG interprets a file. I implemented your example:
1
2 architecture rtl of fifo is
3
4 type state_machine is (idle, write, read, done);
5
6 begin end architecture rtl;
...and issued the following command:
$vsg_parser -f example.vhd
--------------------------------------------------------------------------------
0 |
--------------------------------------------------------------------------------
1 |
<class 'vsg.parser.blank_line'>
--------------------------------------------------------------------------------
2 | architecture rtl of fifo is
<class 'vsg.token.architecture_body.architecture_keyword'>
<class 'vsg.token.architecture_body.identifier'>
<class 'vsg.token.architecture_body.of_keyword'>
<class 'vsg.token.architecture_body.entity_name'>
<class 'vsg.token.architecture_body.is_keyword'>
--------------------------------------------------------------------------------
3 |
<class 'vsg.parser.blank_line'>
--------------------------------------------------------------------------------
4 | type state_machine is (idle, write, read, done);
<class 'vsg.token.full_type_declaration.type_keyword'>
<class 'vsg.token.full_type_declaration.identifier'>
<class 'vsg.token.full_type_declaration.is_keyword'>
<class 'vsg.token.enumeration_type_definition.open_parenthesis'>
<class 'vsg.token.enumeration_type_definition.enumeration_literal'>
<class 'vsg.token.enumeration_type_definition.comma'>
<class 'vsg.token.enumeration_type_definition.enumeration_literal'>
<class 'vsg.token.enumeration_type_definition.comma'>
<class 'vsg.token.enumeration_type_definition.enumeration_literal'>
<class 'vsg.token.enumeration_type_definition.comma'>
<class 'vsg.token.enumeration_type_definition.enumeration_literal'>
<class 'vsg.token.enumeration_type_definition.close_parenthesis'>
<class 'vsg.token.full_type_declaration.semicolon'>
--------------------------------------------------------------------------------
5 |
<class 'vsg.parser.blank_line'>
--------------------------------------------------------------------------------
6 | begin end architecture rtl;
<class 'vsg.token.architecture_body.begin_keyword'>
<class 'vsg.token.architecture_body.end_keyword'>
<class 'vsg.token.architecture_body.end_architecture_keyword'>
<class 'vsg.token.architecture_body.architecture_simple_name'>
<class 'vsg.token.architecture_body.semicolon'>
So this shows you how VSG is classifying the files and assigning tokens from vsg.token.
For this example, idle
, write
, read
and done
are classified as vsg.token.enumeration_type_definition.enumeration_literal
. So a rule can be written to apply prefixes/suffixes as current rule definitions, such as signal_600
work on a single token:
1
2 from vsg.rules import token_suffix
3
4 from vsg import token
5
6 lTokens = []
7 lTokens.append(token.signal_declaration.identifier)
8
9
10 class rule_600(token_suffix):
11 '''
12 This rule checks for valid suffixes on signal identifiers.
13 Default signal suffix is *\_s*.
14
15 |configuring_prefix_and_suffix_rules_link|
16
17 **Violation**
18
19 .. code-block:: vhdl
20
21 signal wr_en : std_logic;
22 signal rd_en : std_logic;
23
24 **Fix**
25
26 .. code-block:: vhdl
27
28 signal wr_en_s : std_logic;
29 signal rd_en_s : std_logic;
30 '''
31
32 def __init__(self):
33 token_suffix.__init__(self, 'signal', '600', lTokens)
34 self.suffixes = ['_s']
35 self.solution = 'Signal identifiers'
So it is just a matter of copying that rule and changing the token type:
1
2 from vsg.rules import token_suffix
3
4 from vsg import token
5
6 lTokens = []
7 lTokens.append(token.enumeration_type_definition.enumeration_literal)
8
9
10 class rule_600(token_suffix):
11 '''
12 This rule checks for valid suffixes on signal identifiers.
13 Default signal suffix is *\_s*.
14
15 |configuring_prefix_and_suffix_rules_link|
16
17 **Violation**
18
19 .. code-block:: vhdl
20
21 signal wr_en : std_logic;
22 signal rd_en : std_logic;
23
24 **Fix**
25
26 .. code-block:: vhdl
27
28 signal wr_en_s : std_logic;
29 signal rd_en_s : std_logic;
30 '''
31
32 def __init__(self):
33 token_suffix.__init__(self, 'signal', '600', lTokens)
34 self.suffixes = ['_s']
35 self.solution = 'Enumerated type'
The docstring and lines 33 and 35 would have to updated, but that kind of gives you an idea for what is involved.
So as long as you want every enumerated type to have a prefix/suffix, then we can create a rule for it. If there are special cases, for example different prefixes/suffixes if it is a record type versus not a record type, then it will get more difficult because VSG will have to know context.
If you would like prefix/suffix rules for enumerated types, go ahead and create another issue.
Regards,
--Jeremy
Hey @hcommin,
I have pushed an update for this to the issue-759 branch. When you get a chance could you check it out on your end.
I will plan to merge this to trunk on April 6th if I do not hear back from you by then.
Thanks,
--Jeremy
@jeremiah-c-leary
The behavior looks correct in my test cases. I need to spend a bit more time looking into the things you wrote above, as the coding guidelines in my workplace allow some differences for record
types (which don't fit really naturally with the LRM). But, on the technical side, I think this issue is ready to close.
I just spotted a couple of very minor documentation issues:
record_type_definition_004
in docs/record_type_definition_rules.rst (**block**
should be **record**
):record_type_definition_004
##########################
|phase_1| |error| |structure|
This rule checks the **is** keyword is on the same line as the **block** keyword.
on it's own line
should be replaced with on its own line
. (This probably applies to several documents).Hey @hcommin,
Thanks for catching the documentation issues. I have pushed an update.
--Jeremy
Rules for
end record
Currently, the type rules don't cover
end record
.For example, with this input:
the default fix is:
I guess the space before the semicolon is being handled by whitespace_003.
I'm not certain, but I guess 2 new rules matching these descriptions would make sense:
end
keyword".end
andrecord
keywords".Rules for
end record
optional nameThere is an optional
end record [ record_type_simple_name ]
defined in the VHDL LRM. There doesn't appear to be a type rule covering this.I guess this could be handled the same as other optional names. For example, the corresponding
entity
rules are:I'm not sure if an additional rule is needed to check for a single space between
end record
and the optional type name, or if this is already handled by another rule.Rules for line breaks
I am aware of 2 common styles. The
record
keyword belongs either on the same line astype
, or on its own line:I'm not sure if these options could be handled similarly to simple cases like assert_002 or assert_004 because I'm not sure how this interacts with the indentation configuration. My guess is that there is no problem because the record elements simply have +1 indentation.
I can see that a
vsg/token/record_type_definition.py
file exists, but I haven't wrapped my head around how indentation configuration works yet.