jeremiah-c-leary / vhdl-style-guide

Style guide enforcement for VHDL
GNU General Public License v3.0
179 stars 38 forks source link

Different comment indent for pragmas #1054

Closed SittingDuc closed 7 months ago

SittingDuc commented 8 months ago

Hello,

My code-base contains a large count of what I call "pragmas" - instructions to the tools, usually -- synthesis translate_off or -- pragma optimisation_disable 14

Their defining property is they match a regexp '^-- pragma' or '^-- synthesis', which means they are in the first column always; not indented with the code they are interfering with.

By default, these pragmas fail rule 'comment_010' because they are not indented to match the code around them. My feature request is to add a rule to not fail 'comment_010' on pragmas that match the above regexp and still fail on any other comments that are not indented to match the surrounding code. (so simply disabling rule 'comment_010' is not going to save me)

An example file (I can't upload .vhd? really? oh well) is provided.

pragma_test.vhd.txt

At present this example shows two violations of comment_010, one on line 18 and one on 20. I am asking for a feature request, ideally for a new or custom rule (so everyone not-me can disable it) that has this example file report no failures.

Thank you,

jeremiah-c-leary commented 7 months ago

Afternoon @SittingDuc ,

I was wondering if -- pragma and -- synthesis are enough to determine if a comment is a pragma or not. I could see either as part of a larger comment block:

-- this construct .... gives the best
-- synthesis results

I wonder if -- pragma and -- synthesis would be too gready.

What do you think?

--Jeremy

SittingDuc commented 7 months ago

That is an interesting point and I hadn't considered it. Obviously my codebase is not representative but it is the one I want to apply the feature to. '-- pragmas' is showing up as a false positive. And '--synthesise' and '--synthesis' are showing up as false positives. So you are right, the bare word is too greedy.

Looking more closely at the codebase, the only lines I care about can be described as -- synthesis translate_off, --synthesis translate_on as literals, and then --pragma vendor something or other where 'vendor' is from a short list of one or two companies.

So to keep the feature request alive, it would need expanding to match '^-- synthesis translate_{on|off}$', and '^-- pragma vendor' where "vendor" is one or two string literals probably lifted out of the configuration json

jeremiah-c-leary commented 7 months ago

Morning @SittingDuc ,

I ran into this list of possible pragmas.

Given there are really no rules for the formatting of a pragma, what if I provide a configuration option like this:

pragma:
  - 'synthesis translate_off'
  - 'synthesis translate_on'

I could treat the entries as regular expressions:

pragma:
  - '--\s+synthesis\s+\w+\s*$'
  - '--\s+synthesis\s+\w+\s+\w+\s*$'
  - '--\s+pragma\s+\w+\s*$'
  - '--\s+pragma\s+\w+\s+\w+\s*$'
  - '--\s+vhdl_comp_[on|off]\s*$'
  - '--\s+altera\s+\w+\s*$' 
  - '--\s+RTL_SYNTHESIS\s+\w+\s*$' 

The above list would match everything in the link above.

I suppose I should add 'synopsys' and 'xilinx' too:

pragma:
  - '--\s+synthesis\s+\w+\s*$'
  - '--\s+synthesis\s+\w+\s+\w+\s*$'
  - '--\s+pragma\s+\w+\s*$'
  - '--\s+pragma\s+\w+\s+\w+\s*$'
  - '--\s+vhdl_comp_[on|off]\s*$'
  - '--\s+altera\s+\w+\s*$' 
  - '--\s+RTL_SYNTHESIS\s+\w+\s*$' 
  - '--\s+synopsys\s+\w+\s*$'
  - '--\s+synopsys\s+\w+\s+\w+\s*$'
  - '--\s+xilinx\s+\w+\s*$'
  - '--\s+xilinx\s+\w+\s+\w+\s*$'

This would allow users to update the list if something is not currently defined.

What do you think?

--Jeremy

jeremiah-c-leary commented 7 months ago

Afternoon @SittingDuc ,

I pushed an update for this issue to the issue-1054 branch.

I added a rule pragma_300 which allows you to control the indent of pragmas defined per a series of regular expressions. I added a configuring pragmas section to the documentation which describes how it works.

When you get a chance could you check it out on your end and let me know how it goes?

Thanks,

--Jeremy

SittingDuc commented 7 months ago

Hello,

(note to self, delete stale installed copies before adding comments on github, so you are running the right tool version.)

Ran vsg -oc original.json to get the new configuration options and then played with. Setting "indentSize": 0 on pragma_300 and testing against "real" VHDL code, the branch correctly managed -- synthesis translate_off as pragma_300 not comment_010, which is a win; and even better, it protested when I set the indent "wrong" as a test. So this part is working. :+1:

Attempting to add my own pragma regexps to my configuration .yaml produced a python Traceback.

  File ... /vsg/__main__.py, line 153, in main
    for tResult in pool.imap(f, enumerate(commandLineArguments.filename)):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
KeyError: 'regexp'

File vsg/config.py has


225    dStyle = read_predefined_style(commandLineArguments.style)
226    add_pragma_regular_expressions(dStyle)
227    dConfig = read_configuration_files(dStyle, commandLineArguments)

I made the traceback go away by changing this to

    dStyle = read_predefined_style(commandLineArguments.style)
    dConfig = read_configuration_files(dStyle, commandLineArguments)
    add_pragma_regular_expressions(dConfig)

(right about now I wish I could get github fork + PR to actually work for me)

Finally, with my edit to vsg/config.py, and my custom regexp in config.yaml as 'pragma' 'patterns', I can parse "real" VHDL with both --synthesis translate_off and -- pragma duc optimise=4 lines being applied against parse_300 and not comment_010 and being indented as desired. :+1: and you helped me find a bug in our codebase where we had synthesis_translate_off which of course didn't do what I wanted :grinning:

Final note: with all the new regexp added, I do worry that I have gravely reduced performance (210 line test file, consistently runs vsg -ap -f bad_duc.vhg in 2.0s on branch 1054, and in 1.74s on branch 1053.. (maybe default pragma_300 to disabled?)

SittingDuc commented 7 months ago

Added a PR to your branch, I think?

jeremiah-c-leary commented 7 months ago

Morning @SittingDuc ,

Thanks for the PR. Still not 100% sure why that makes a difference yet. I added a couple of tests to validate the configuration of pragmas to ensure the problem does not come back.

and you helped me find a bug in our codebase where we had synthesis_translate_off which of course didn't do what I wanted

Sweet. It is nice to hear that this tool makes a difference.

Final note: with all the new regexp added, I do worry that I have gravely reduced performance (210 line test file, consistently runs vsg -ap -f bad_duc.vhg in 2.0s on branch 1054, and in 1.74s on branch 1053.. (maybe default pragma_300 to disabled?)

Yeah, the problem is each line needs to be checked against each regular expression. I would think pragmas would constitute less than 1% of the total lines. If there are 8 possible pragmas , then each line that is not a pragma will get checked 8 times.

Unfortunately disabling the rule will not improve run time because the pragmas are detected and classified before any rules are executed.

However, we could set the default pragmas to an empty list. This would short circuit the checks on every line.

I will also investigate any performance improvements I can make.

--Jeremy

jeremiah-c-leary commented 7 months ago

Morning @SittingDuc ,

I pushed an update where I check to see if a line is a comment before trying to apply the regexs.

Let me know if you see any performance improvements.

Thanks,

--Jeremy

SittingDuc commented 7 months ago

Hello,

I have updated to the latest commit and run some straw tests on four 'real' VHDL files representing about 4kloc.. I have a copy of VSG installed by PIP (3.18.0 it says), a copy from the issue-1053 branch, and two copies from the issue-1054 branch - friday with working regexp and monday/today with performance improvements.

With the default configuration, there is not much in it, branch-1053 is the same speed as pip, both at 4.2s, and branch-1054-friday is 5.4s, branch-1054-today is 5.3s. The largest number I see is 133% the pip runtime

With my local configuration file, pip doesn't run, branch-1053 is 8.5s, branch-1054-friday is 10.7s and branch-1054-today is 10.0s; so the performance improvements commit over the weekend is a full 7% faster, and my configuration file is about twice as slow as the default configuration.

Which I think is a lot of words to say (I feel) the performance improvement commit is worth keeping, and the end of the branch represents a working fix for the original feature request :+1:

jeremiah-c-leary commented 7 months ago

Evening @SittingDuc ,

The largest number I see is 133% the pip runtime

Wow, 133% is quite significant.

In order to go faster I'm afraid I would need to re-write VSG in another language like C++ or Rust.

Which I think is a lot of words to say (I feel) the performance improvement commit is worth keeping, and the end of the branch represents a working fix for the original feature request

Sweet, I will get this merged to master.

--Jeremy