Documentation: block_400

staerz commented 1 year ago

Environment Ubuntu, VSG 3.13.0

Describe the bug A minor, but possibly misleading flaw in the documentation of block_400.

While the violation gives

variable    var1 : natural;

as an example, the bare rule-centric fix should be

variable var1 : natural;

, that is to only shorten the space between variable and var1 down to 1 space, but not touching anything else, i.e. the space between var1 and the colon (as that is subject to rule block_401).

Additional context Although I was the author of #498 (2 years ago), now, given the huge number of different rules, I would generally like to erode the idea of simplifying similar rules, and block_400 is one of these: It's less about how the block keyword or structure purely related to it is formatted, but it's about a default signal declaration (within a block environment).

To my impression, it makes no difference in which context a signal is declared and I think it would be enough to have one single rule to define how that should be. So the suggestion is to make it context-free.

jeremiah-c-leary commented 1 year ago

Morning @staerz ,

Good catch on the documentation. I have pushed an update to the issue-859 branch.

--Jeremy

jeremiah-c-leary commented 1 year ago

Additional context Although I was the author of https://github.com/jeremiah-c-leary/vhdl-style-guide/issues/498 (2 years ago), now, given the huge number of different rules,

Can you believe it has been 2 years? You have been contributing for half the projects existence.

I would generally like to erode the idea of simplifying similar rules, and block_400 is one of these: It's less about how the block keyword or structure purely related to it is formatted, but it's about a default signal declaration (within a block environment).

To my impression, it makes no difference in which context a signal is declared and I think it would be enough to have one single rule to define how that should be. So the suggestion is to make it context-free.

I can understand the desire to collapse multiple similar rules into a single rule and it is something I have struggled with. The only difference between block_400, architecture_029, function_015, and procedure_010 is the location where it is applied. However, one of the problems I have with combining them is what would I name that single rule.

When I started this 4 years ago, it was more of an experiment to see if I could create something to check the formatting of VHDL code before going into a code review. I just used my knowledge of the VHDL syntax instead of referencing the LRM. This led to poorly named rule groups such as signal and constant. They should have been named signal_declaration and constant_declaration to match the LRM.

When it was clear the implementation using regular expressions was limiting my ability to implement features I decided to create my own parser. It was during that effort I made the decision that rules should strictly follow the scoping of the LRM. I realized I made several mistakes like issue #470.

So now my guideline for creating rules is to match the LRM. One could argue this results in rule inflation as a single rule could encompass multiple rules. One example includes the casing of keywords. I could have a single rule that would case all the keywords instead of using rule groups. However, at this point I was committed to keeping everything separate.

Fast forward to issue #850 .

The with is really a selected assignment of four different varieties:

concurrent_selected_signal_assignment ::=
  with expression select [ ? ]
    target <= [ guarded ] [ delay_mechanism ] selected_waveforms ;

selected_force_assignment ::=
  with expression select [ ? ]
    target <= force [ force_mode ] selected_expressions ;

selected_variable_assignment ::=
  with expression select [ ? ]
    target := selected_expressions ;

selected_waveform_assignment ::=
  with expression select [ ? ]
    target <= [ delay_mechanism ] selected_waveforms ;

I started creating rules at those levels for indenting, whitespace and casing. So four rules to case the with keyword, four rules to case the select keyword etc... It seemed a bit overkill, but I was following my guideline and I had the rule groups so the user would not have to configure each of them individually. However, things fell apart when I went to implement the structure rule for those statements. It seemed too burdensome to ask the user to configure four separate rules for what is essentially the same structure.

The question is what do I name the rule and still tie it back to the LRM? The four varieties do not converge in the LRM to a single point so I did not have anything in the LRM I could use. It eventually occurred to me that I could use selected_assignment as those two words were in the name of every variety and I felt it was descriptive enough. Once I made that decision I removed all the indenting, whitespace and casing rules from the individual varieties and re-implemented them under selected_assignment.

So I must modify my current guideline to allow for combining multiple rules into a single rule. This could be tricky though as I think it is important not to combine too many rules into a single rule. I believe I should use the LRM to help determine the name I should use.

Bringing this back to your suggestion, signals etc... are defined in the declarative_part of multiple structures:

architecture_body ::=
  architecture identifier of entity_name is
    architecture_declarative_part
  begin
    architecture_statement_part
  end [ architecture ] [ architecture_simple_name ] ;

block_statement ::=
  block_label :
    block [ ( guard_condition ) ] [ is ]
      block_header
      block_declarative_part
    begin
      block_statement_part
    end block [ block_label ] ;

package_body ::=
  package body package_simple_name is
    package_body_declarative_part
  end [ package body ] [ package_simple_name ] ;

package_declaration ::=
  package identifier is
    package_header
    package_declarative_part
  end [ package ] [ package_simple_name ] ;

process_statement ::=
  [ process_label : ]
    [ postponed ] process [ ( process_sensitivity_list ) ] [ is ]
      process_declarative_part
    begin
      process_statement_part
    end [ postponed ] process [ process_label ] ;

subprogram_body ::=
  subprogram_specification is
    subprogram_declarative_part
  begin
    subprogram_statement_part
  end [ subprogram_kind ] [ designator ] ;

Looking at names of where signals etc... can be created I would use the name declarative_part. I could put all rules that cover how individual elements into those regions behave. That would include alignment and blank line rules. So now I can collapse block_400, architecture_029, function_015, and procedure_010 into a single rule under declarative_part.

I do not think I can justify collapsing all the keyword case rules into a single rule. I can, but it seems the scope would not support such a change. I also wonder if it will make testing harder.

I apologize if this was too long winded to finally get to the point of agreeing with you that I can reduce the number of rules. It is important to keep the design of VSG internally consistent. I also feel if you have been willing to stay on this rollercoaster for 2 years that you are committed for the long haul and deserved an explanation of why the rules are structured as they are. I also appreciate your feedback and insights and the more information I can give you the better the feedback.

Regards,

--Jeremy

staerz commented 1 year ago

Hi @jeremiah-c-leary,

thanks for your exhaustive explanation, it is very appreciated.

I've done a bit of archaeology in our project, and the first mention of VSG is from 23 September 2019 when we were identifying tools for VHDL style.

I see and perfectly understand the difficulty of making choices of your own implementation of a syntax parser with respect to a programming language, or more rigorously, a language reference manual. Personally, I'd also stick to the LRM as that's the ultimate reference, but I agree, that's a learning experience and I'd have the same revelations.

That said, we agree on the general goal here it seems.

I'm not saying that VSG needs to entirely redefine everything (it's OK to have a dedicated rule for the casing of each syntax key word, for example). For rules that cover structure, I though am hugely in favour of abstracting things: E.g. if architecture_declarative_part and block_declarative_part allow the same definitions (of e.g. signals), then I'd think of a base class rule implementation, and yes, if you want, a derived rule for each kind, but actually more of an implicitly derived realisation of that base class rule implementation.

From a user perspective (who usually doesn't know all details about the LRM or might even not know about its existence) though, a signal declaration is a signal declaration and I (as a user) know that I may put one directly under the architecture, or have one in a block (etc.), but each time I apply the very same syntax and style rules. So why should I hence have to configure all varieties of it in/for a style checking tool?

So I think that's where the drive for

combining multiple rules into a single rule

comes from and there is merit:

it reduces the rules to maintain to a minimum
it reduces the configuration to maintain to a minimum
it reduces the documentation to maintain to a minimum

(Abstraction is good!)

one of the problems I have with combining them is what would I name that single rule.

Don't you worry about that, that's the easiest to solve! For the particular example, well, we should analyse first, what can be in the declarative_part: It can be signals, variables (sometimes), functions, procedures, etc. ... each of these have their specific structural rules and I'd keep those separate. If I had to name these, I'd go for sth. like "signal_declaration", "variable_declaration" etc.

And then, adding one more level of abstraction, signal declarations, variable declarations and constant declarations are very similar as well: After the assignment operator, they could contain the same content (hence structure): I would aim for implementing this with another abstracted base class.

So let's distinguish between the different sorts of VSG rules: blank line rules are OK to exist for every single kind of hierarchy element (they could use an abstracted base rule implementation under the bonnet though, but that's not visible to the user), casing rules same story. Declarative rules should be context free though and indentation should follow accordingly. Similarly statements (in the various allowed statement_parts) should also be context free.

If you agree to this general approach and see merit for VSG to go towards it in future, then I'd suggest you to open a dedicated issue where we'll take the time to go over what we have and what looks "similar" to me and where we can/should possibly combine and abstract things. I do not expect a solution tomorrow, but I think there would be a huge benefit from it in the long run.

jeremiah-c-leary commented 1 year ago

Morning @staerz ,

If you agree to this general approach and see merit for VSG to go towards it in future, then I'd suggest you to open a dedicated issue where we'll take the time to go over what we have and what looks "similar" to me and where we can/should possibly combine and abstract things. I do not expect a solution tomorrow, but I think there would be a huge benefit from it in the long run.

I agree in general to your approach. I would like to mirror the effort we took to define constraint rules. That process worked out well. As a bonus we will have the methodology and the justifications for it documented. I can then reference it when new rules are added. This will in turn provide a consistent view to the end user and hopefully make it easier for them to use VSG. It will also make development easier on me. So I can see no downside to persuing this.

I will create another issue to cover the documentation and begin with an initial proposal for how to go forward.

I will also close this issue after the new one is created.

Regards,

--Jeremy

staerz commented 1 year ago

Alright, works for me.

Maybe just use this issue to fix the documentation of block_400 and have us close this issue. The new issue then on new methods ... (we deviated in our discussion from the original purpose of that issue, time to come back to it...)

jeremiah-c-leary commented 1 year ago

Morning @staerz ,

I agree. I have created issue #859 to track the documentation changes.

The updated documentation for this issue has been pushed to the issue-859 branch.

--Jeremy

staerz commented 1 year ago

Yes, I had a look at the updated doc and it looks good.

Ready to merge from my side :+1:

jeremiah-c-leary commented 1 year ago

Sweet I will merge this to master.

--Jeremy

jeremiah-c-leary / vhdl-style-guide

Documentation: block_400 #859