Integration with current VHDL VSCode extensions

jeremiah-c-leary / vhdl-style-guide

Style guide enforcement for VHDL

GNU General Public License v3.0


Integration with current VHDL VSCode extensions #328

Open GlenNicholls opened 4 years ago

GlenNicholls commented 4 years ago

What is your question? DISCLAIMER: This is a question and suggestion. Let me know if you want me to separate the suggestion into a separate issue for tracking my feature request.

I actively use the rust_hdl_vscode extension and contribute on occasion as time permits. One feature that is missing is code beautification and this project might be able to provide that functionality. Are there future plans to support integrating this project with VSC extensions? Even if beautification is not on the short-term road map, allowing the user to use this in VSC by configuring settings for their personal/corporate style would allow this project to be used much more broadly. Not only this, but it would be nice to get real-time feedback about VHDL style.

Things like:

rising_edge()/falling_edge() vs. clk'event
parenthesis like
- port map ( vs
- port map (
indentation style (tabs, spaces, number of spaces for indentation)
ieee standard libraries (numeric_std vs. std_logic_arith)
synchronous vs. asynchronous resets
camel case vs. underscores
port map delimiters like <i | o | io>_*, *_<i | o | io>, <In | Out | IO>*, or *<In | Out | IO> placing port directions in port names
specific casing for generics
- g_*, *_g, all caps with underscores
specific casing for constants
- c_*, all caps with underscores
parenthesis surrounding conditionals in if else or case statements
naming processes
specific delimiters for "blocks" of code
- blk_*/*_blk for block name
- gen_*/*_gen for generate
- gen_loop_*/*_gen_loop for for generate loop
aligning : (and also => for entity instantiation) for entity/component declarations for input/output ports, all ports, etc.
aligning signal declarations like
- signal en : std_logic := '0'; signal en123 : std_logic_vector(1 downto 0) := (others => '0');
- to
```
signal en    : std_logic := '0';
signal en123 : std_logic_vector(1 downto 0) := (others => '0');
```
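To make the last item concrete, here is a minimal Python sketch of colon alignment for consecutive signal declarations. It is not how VSG implements alignment; the function name `align_signal_colons` and the regular expression are assumptions made for illustration, and the pattern deliberately ignores multi-name declarations and comments.

```python
import re

# Matches "signal <name> : <rest>". Deliberately simple: it does not handle
# declarations such as "signal a, b : std_logic;" or trailing comments.
SIGNAL_RE = re.compile(r"^(\s*)signal\s+(\w+)\s*:\s*(.*)$", re.IGNORECASE)


def align_signal_colons(lines):
    """Pad signal names so ':' lines up within each run of consecutive declarations."""
    out = []
    block = []  # pending (indent, name, rest) tuples for the current run

    def flush():
        # Emit the pending run, padding every name to the longest one.
        if block:
            width = max(len(name) for _, name, _ in block)
            for indent, name, rest in block:
                out.append(f"{indent}signal {name.ljust(width)} : {rest}")
            block.clear()

    for line in lines:
        match = SIGNAL_RE.match(line)
        if match:
            block.append(match.groups())
        else:
            flush()
            out.append(line)
    flush()
    return out
```

Feeding it the two unaligned `en` / `en123` declarations above yields the aligned pair shown in the fenced block.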

One thing that I've found is there are certain things that people disagree on regarding VHDL style (parenthesis placement being a big one), so allowing customization of these preferences would be invaluable. Allowing the user to enable/disable features (and customize settings for enabled features) would really help this project take off and attract new contributions, feature requests, and bug reports.

With a code style-guide like this, I can see how it is tempting to have "one standard to rule them all". However, I've found that firmware developers have massively varying opinions, even within a single organization. Allowing the user to completely configure settings like this (think resets with ASIC vs. FPGA) would allow an organization to not only find something like this useful, but also be able to tune it to the style that their developers agree with. Until there is an agreed upon standard for VHDL like PEP-8, I don't think enforcing what you believe to be best is the best approach.

nfrancque commented 4 years ago

Some support for these types of configurations but I would agree there's some work yet to be done there.

Questions like this kind of make you wonder where the line between rust_hdl and VSG should be drawn.

Especially if we end up implementing this on top of rusthdl or pyvhdlparser, the line is very very blurred. For example, a coding standard mandating registers start with q requires some in depth analysis.

Jeremy can comment on the configuration options - we've talked offline that it'd be a really great goal to work with the FOSS community and define a PEP8-like standard. But I see your point that maybe that curtails usage/contributions.

GlenNicholls commented 4 years ago

@nfrancque

Questions like this kind of make you wonder where the line between rust_hdl and VSG should be drawn.

I agree. I don't think that coupling this tightly with another tool is a good idea as I'm sure many users would like to simply run this standalone or a CI job against code that is committed. My hope with this issue was to point out that making it possible to integrate this with VSC and other tools would be a nice feature!

Especially if we end up implementing this on top of rusthdl or pyvhdlparser, the line is very very blurred. For example, a coding standard mandating registers start with q requires some in depth analysis.

I see your point regarding in depth analysis. However, all the code I see has varying styles regarding constants/generics and so forth. A capability like this would allow an organization to set a desired style to keep code consistent. I'm of the mindset that it should be up to the developer to write clean and consistent code as long as VHDL "best practices" are used at a minimum. By the same token, though, I have also read and edited a LOT of code that is just a rat's nest of styles that make it tremendously difficult to understand, let alone fix/edit without breaking existing functionality (think large single process state machines).

Providing functionality for more strict style-checking could be extremely useful. For the company I work for, quite a few of us like the idea of making our requirements stricter simply for maintainability/readability of our code and this project seems to have the ability to add these types of features, albeit with much more work. This hurts us 10x more when the original authors of a module/library leave the company.

With Git and Atlassian tools, it is easy to set up code-reviews, but some level of automation for code structure would be nice as man-hours would instead be spent following the logic of the code instead of checking a long list of a company's style guide.

Jeremy can comment on the configuration options - we've talked offline that it'd be a really great goal to work with the FOSS community and define a PEP8-like standard. But I see your point that maybe that curtails usage/contributions.

In regards to a PEP8 like standard, I think this is going to be a tough nut to crack. I think starting polls about different aspects of the language is a good starting point. In my mind, this is something that would have to be a light checker that only looks at certain parts of the language like tabs/spaces, standard libraries, structure of conditional statements, maybe resets (active high/low sync/async), sensitivity list completion, checking all states covered in generate and case statements, spacing between blocks of code, etc.
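A "light checker" of this kind could be sketched in a few lines of Python. The rule names (`no_tabs`, `indent_multiple`, `no_std_logic_arith`), the configuration layout, and the messages below are all invented for this example and do not correspond to any existing tool; the point is only that each check can be switched on or off independently.

```python
import re

# Hypothetical rule names and defaults, chosen only for this sketch.
DEFAULT_CONFIG = {
    "no_tabs": {"enabled": True},
    "indent_multiple": {"enabled": True, "size": 2},
    "no_std_logic_arith": {"enabled": True},
}

ARITH_RE = re.compile(r"\buse\s+ieee\.std_logic_arith\b", re.IGNORECASE)


def check(source, config=DEFAULT_CONFIG):
    """Return a list of (line_number, rule_name, message) tuples."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Leading whitespace of the line.
        indent = line[: len(line) - len(line.lstrip())]
        if config["no_tabs"]["enabled"] and "\t" in indent:
            findings.append((number, "no_tabs", "tab used for indentation"))
        rule = config["indent_multiple"]
        # Only judge the width of space-only indents on non-blank lines.
        if rule["enabled"] and line.strip() and "\t" not in indent:
            if len(indent) % rule["size"]:
                findings.append(
                    (number, "indent_multiple",
                     f"indent is not a multiple of {rule['size']}")
                )
        if config["no_std_logic_arith"]["enabled"] and ARITH_RE.search(line):
            findings.append((number, "no_std_logic_arith", "prefer ieee.numeric_std"))
    return findings
```

Passing a modified copy of `DEFAULT_CONFIG` (for example with `"no_tabs": {"enabled": False}`) disables a single rule without touching the others. Checks such as sensitivity list completion cannot be done line by line like this and would need a real parser.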

jeremiah-c-leary commented 4 years ago

Greetings @GlenNicholls,

I am not familiar with VSCode extensions, but it seems worth investigating. I attempted to structure the core of VSG so it could be included in other tools. So I would hope it could be made into an extension.

One thing that I've found is there are certain things that people disagree on regarding VHDL style (parenthesis placement being a big one), so allowing customization of these preferences would be invaluable. Allowing the user to enable/disable features (and customize settings for enabled features) would really help this project take off and attract new contributions, feature requests, and bug reports.

There are differing opinions on style within my own department. Some people are more vocal than others. Once I started using VSG on my projects, I would invariably hear from them about some formatting they did not like. As I reflect back, the tool was surfacing opinions about style before the code review. Our style guide is very vague, so we end up with an "unofficial" style guide that nobody knows about until you get to a code review. Discussing style issues in a code review is frustrating.

VSG was designed to be fully customizable. Any rule can be disabled, some can be configured (for example upper and lower case), using a JSON or YAML file. VSG also allows the user to write their own rules. There was another group in my company that wanted a slightly different rule than what I had. So I sat down with @csomers82 and we crafted the new rule and disabled the old one. Now they have a customized version of VSG for their code base.
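As a rough illustration of such a configuration file, the Python sketch below renders a small JSON document. It assumes a top-level `rule` mapping whose entries carry per-rule attributes such as `disable` and `case`; that layout should be checked against the VSG documentation. The rule identifiers are placeholders that only follow the `<group>_<number>` naming pattern, and `render_config` is a helper invented for this example.

```python
import json

# Assumed layout: a top-level "rule" mapping with per-rule attributes.
# The rule identifiers are placeholders, not a recommendation; look up the
# real identifiers in the VSG rule documentation before using them.
config = {
    "rule": {
        "port_010": {"disable": True},      # switch one rule off entirely
        "constant_002": {"case": "upper"},  # configure the case a rule enforces
    }
}


def render_config(cfg):
    """Serialize the configuration as pretty-printed JSON text."""
    return json.dumps(cfg, indent=2) + "\n"


if __name__ == "__main__":
    print(render_config(config), end="")
```

The rendered text would be saved to a file and handed to VSG through its configuration option; a YAML file with the same structure should work the same way.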

With Git and Atlassian tools, it is easy to set up code-reviews, but some level of automation for code structure would be nice as man-hours would instead be spent following the logic of the code instead of checking a long list of a company's style guide.

We use Atlassian tools also, and it was a particular code review I attended that prompted me to start VSG. There was a code bug that was missed because I was distracted by coding style. Now, we run VSG as part of our Bamboo jobs to enforce a consistent coding style.

I remember the code reviews after using VSG being more productive because there were very few code style issues. There seems to be a shift in your thinking when reviewing code you know has already been vetted for style. Probably because the code style doesn't stand out because it matches what you expect.

I also remember one code review where we decided on a code style issue that was not covered by VSG. I wrote a new rule and ran VSG against the entire code base. It found several files where the new rule was violated and fixed them. I then checked the modified files in.

With a code style-guide like this, I can see how it is tempting to have "one standard to rule them all". However, I've found that firmware developers have massively varying opinions, even within a single organization. Allowing the user to completely configure settings like this (think resets with ASIC vs. FPGA) would allow an organization to not only find something like this useful, but also be able to tune it to the style that their developers agree with. Until there is an agreed upon standard for VHDL like PEP-8, I don't think enforcing what you believe to be best is the best approach.

In regards to a PEP8 like standard, I think this is going to be a tough nut to crack. I think starting polls about different aspects of the language is a good starting point. In my mind, this is something that would have to be a light checker that only looks at certain parts of the language like tabs/spaces, standard libraries, structure of conditional statements, maybe resets (active high/low sync/async), sensitivity list completion, checking all states covered in generate and case statements, spacing between blocks of code, etc.

@m-kru and I have had some discussions on this in issue#220. VSG started out essentially expressing my opinion on how code should be formatted. For a certain portion of developers, one style is as good as another. Others have stronger opinions or guidelines they need to follow for their company.

We discussed having a smaller set of rules that would run out of the box. Then the user would have to configure the rest of the rules. There are currently over 350 of them though. So now the burden is on the person configuring the rules to enable a subset of them. Although that is pretty typical for EDA tools. Not that it is very difficult, it is just the number of rules that would need to be configured.

I had thought it would be interesting to have a gallery of styles available. Then a user can swap between styles to find one that is 90+% of what they want and then configure what they do not like. These styles could be submitted by the users of VSG. Maybe with enough of these styles defined by users, they could be mined for the minimum set VSG should enforce.

Regards,

--Jeremy

eine commented 4 years ago

@GlenNicholls, if you missed it, please do read #312. The content does not correspond with the title, as it is a thorough discussion about the current state of the art regarding open source parsers for (modern) VHDL. You will find many similarities with the discussion about VUnit, OSVVM and UVVM we had a couple of days ago (January 23, 2020 2:53 AM).

I believe that most of the features you suggest do require some (or much) semantic analysis, and not only parsing. There is no open source tool that provides semantic analysis yet, except GHDL. However, GHDL does it internally, and it is not specifically designed to be used for this purpose. It can be done (some info at https://ghdl.readthedocs.io/en/latest/internals/Overview.html), but it is not plug and play. Hence, it is compulsory as a community to help GHDL, rust_hdl and/or PyVHDLParser, so that semantic analysis of modern VHDL with FOSS tools is possible. Of course, it is equally possible to start or enforce some other project. Precisely, the discussion in #312 started because this project was considering duplicating existing work. Nevertheless, I believe we'd better focus on 1-2.

First, we need a tool that can read VHDL, understand it, and print it out again. GHDL does currently NOT preserve comments. Both rust_hdl and PyVHDLParser do preserve comments and both are expected to support generating sources from the AST. However, I don't know if the feature is available already. That feature alone might allow to do some easy modifications, such as replacing rising_edge with clk'event. Still, it might require some hackish solution.
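To show what such a "hackish" text-level modification might look like, here is a minimal Python sketch that rewrites the `clk'event and clk = '1'` idiom into `rising_edge(clk)` (the opposite direction works analogously). The function name `modernize_edges` and the regular expression are assumptions made for this example. The substitution also fires inside comments and string literals, and the two forms are not strictly equivalent for `std_ulogic` signals (`rising_edge` also checks the previous value), so this is a sketch of the limitation as much as of the technique.

```python
import re

# Matches "<sig>'event and <sig> = '1'" (or '0') with the same signal name on
# both sides. Purely textual: it knows nothing about comments or strings.
EDGE_RE = re.compile(
    r"\b(?P<sig>\w+)'event\s+and\s+(?P=sig)\s*=\s*'(?P<level>[01])'",
    re.IGNORECASE,
)


def modernize_edges(source):
    """Replace clk'event idioms with rising_edge()/falling_edge() calls."""

    def repl(match):
        func = "rising_edge" if match.group("level") == "1" else "falling_edge"
        return f"{func}({match.group('sig')})"

    return EDGE_RE.sub(repl, source)
```

A call such as `modernize_edges("if clk'event and clk = '1' then")` returns `"if rising_edge(clk) then"`.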

To implement rules such as context-dependent parenthesis or signal/variable renaming, the semantic analysis is required. Since names can be overloaded/hidden in VHDL, simple renaming is very limited. You need the token/identifier and the context/meaning too.
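The hiding problem can be demonstrated with a short Python sketch. The VHDL fragment and the helper `naive_rename` are made up for this example: a process variable named `data` hides an architecture-level signal of the same name, and a whole-word textual rename cannot tell the two apart.

```python
import re

# Illustrative VHDL fragment: the process variable hides the signal.
VHDL = """\
architecture rtl of demo is
  signal data : std_logic;
begin
  process (clk)
    variable data : integer;  -- hides the signal inside this process
  begin
    data := 0;
  end process;
  q <= data;
end architecture;
"""


def naive_rename(source, old, new):
    """Rename every whole-word occurrence, with no notion of scope."""
    return re.sub(rf"\b{re.escape(old)}\b", new, source)


renamed = naive_rename(VHDL, "data", "data_s")
```

All four occurrences are rewritten, including the variable declaration and the assignment inside the process, although only the signal declaration and `q <= data;` refer to the signal. Telling them apart requires resolving each identifier to its declaration, which is the semantic analysis step.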

I believe that only after we achieve the steps before will this project be actually feasible as we envision it. Coherently, I think that the scope of vhdl-style-guide should be focused on providing the definition of a configuration file and the set of rules that users can select. However, the actual implementation of the modifiers should be based on an existing tool. This is not to make this project smaller, but to make it bigger from a functionality point of view rather than from a LOC perspective. Therefore, I completely support the discussion proposed in this issue. I just think that it might be too early to have most of it implemented.

Note that @kraigher has commented in different contexts that his focus with rust_hdl is on providing a toolkit that other developers can use to build useful features. Hence, it is not on his roadmap to code the "style checking", "style modification" or "documentation generation" components of the ecosystem. He will support those use cases by extending the "core" of rust_hdl if required. However, additional human resources are needed. That's where I think that this project fits. Of course, this same assertion is valid for PyVHDLParser or GHDL too. The case with @Paebbels is slightly different, as I think that he is willing to write some of this style related features himself. Anyway, due to time constraints, we should consider all three projects to be in a similar state.

Are there future plans to support integrating this project with VSC extensions?

I find this question (and the title) misleading. Although you seem to be interested in using this project as a VSC extension, most of the content of your first message (and later comments) is about "style checking and modification" features that are kind of independent from VSC. Don't get me wrong: I am interested in using this project in VSC too (as it is the main editor I use). However, I think it should be discussed in a different issue to the long list of possible rules.

I can see how it is tempting to have "one standard to rule them all". However, I've found that firmware developers have massively varying opinions, even within a single organization. Allowing the user to completely configure settings like this (think resets with ASIC vs. FPGA) would allow an organization to not only find something like this useful, but also be able to tune it to the style that their developers agree with.

I believe there are a couple of different issues here. On the one hand, the discussion between having an opinionated formatter or a configurable one has already been long discussed for other languages. I believe it is the most time-consuming and unproductive discussion, because there is no "true" answer; so, we should avoid it. I do agree with you that being configurable would probably make it more appealing for companies/organisations. However, the development/maintenance burden needs to be considered. I believe the priority is to have a single opinionated tool with a single solution for the most basic rules we need to check, which is easy to extend. If some of us, or some organisation, does not like the defaults, it is up to anyone to contribute. Any specific discussion will be much easier when adding options through PRs is possible. The current problem is that VSG is no longer easy to be extended in order to support many of the features you suggest.

On the other hand, I think we should make a difference between modifications that affect style strictly and others that might have further implications. Adding/removing some space around a parenthesis is not the same as replacing rising_edge. I believe that layout (https://vhdl-style-guide.readthedocs.io/en/latest/phases.html) is the easier part to start with, even if multiple options are provided.

Until there is an agreed upon standard for VHDL like PEP-8, I don't think enforcing what you believe to be best is the best approach.

I believe that we should not overlook this. I am convinced that no "standard VHDL style" exists because there is no automatic formatting tool yet. As you saw on the discussion the other day, there are strong and conflicting style opinions in the VASG. This is seen on the standard and IEEE VHDL libs, where there is no common style/naming. It is easy to understand that no one is willing to manually check all the sources to make the style coherent. Furthermore, that person would not know which rules to follow (except her/his own).

However, if we had a configurable formatter, such as the one we are discussing here, it would be trivial to choose any arbitrary style for the standard. Any user would just run the formatter with a local configuration, right after cloning the sources; and with the remote configuration just before pushing. I believe this should be the goal of VSG. Still, as said, there is some hard work ahead.

I had thought it would be interesting to have a gallery of styles available. Then a user can swap between styles to find one that is 90+% of what they want and then configure what they do not like. These styles could be submitted by the users of VSG. Maybe with enough of these styles defined by users, they could be mined for the minimum set VSG should enforce.

@jeremiah-c-leary, I believe that this is the strongest point of this project. Managing +350 rules is a complex task itself. Supporting additional file formats, multiple (nested) configurations files in a repo, or providing a cli option to select between a list of available styles... are all means of enhancement, instead of fighting with parsing non-trivial VHDL.
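One way to picture the combination of a style gallery and nested configuration files is a layered merge: pick a named base style, then overlay repository-level and directory-level overrides. The Python sketch below is only an illustration under that assumption; the style names, the keys, and the functions `merge` and `effective_config` are hypothetical and do not reflect VSG's current behaviour.

```python
import copy

# Hypothetical style gallery: each entry is a complete rule configuration.
STYLES = {
    "compact": {"indent": {"size": 2}, "keywords": {"case": "lower"}},
    "classic": {"indent": {"size": 4}, "keywords": {"case": "upper"}},
}


def merge(base, override):
    """Recursively overlay `override` on `base` without mutating either."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def effective_config(style, *overrides):
    """Start from a named style, then apply overrides from outermost to innermost."""
    config = copy.deepcopy(STYLES[style])
    for override in overrides:
        config = merge(config, override)
    return config
```

For example, `effective_config("classic", repo_cfg, subdir_cfg)` would start from the `classic` style and let the innermost directory's file win on any conflicting key, while keys it does not mention are inherited.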

Regarding the gallery, I believe that an as-simple-as-possible web site with a dropdown list that shows an example code with several styles would be really useful. Something similar to https://swapoff.org/chroma/playground/

kraigher commented 4 years ago

FYI rust_hdl supports enough semantic analysis now to resolve all references to non-overloaded names such as constants, signals, variables, design units etc. That is why we can support goto definition and find references in the language server.

For overloaded names like functions there is still some work but we do now determine their signature and thus duplicate function definitions can be detected.

As was mentioned here the goal of rust_hdl is to build a toolbox as a team effort. Today it consists of two libraries "vhdl_lang" and "vhdl_ls" which is the core language front end and a language server based on the core.

I will not write a style checker or code formatter even though I desire them precisely because I see it as suitable entry point into the project for other developers. It is the major non-technical goal to ensure the project is not a one man show.

m-kru commented 4 years ago

Again, noble but unproductive discussion. We all agree, that it would be much better to have single parser generating AST/CST, but nothing is going to change and each project is going to grow in its own complexity. I think VSG could be relatively easily rewritten to use rust_hdl, however what is the point if all other tools are still going to use their own parsers. There would need to be a more common agreement to take a step back, in order to achieve more in the future. Maybe @eine could be a person trying to unite people.

eine commented 4 years ago

@m-kru, I'm doing my best... https://github.com/ghdl/ghdl/issues/1393#issuecomment-658946312

BTW, I was happy to read that @jeremiah-c-leary's explanation in https://github.com/jeremiah-c-leary/vhdl-style-guide/issues/312#issuecomment-660405030 is similar to the following quote from above:

I believe that this is the strongest point of this project. Managing +350 rules is a complex task itself. Supporting additional file formats, multiple (nested) configurations files in a repo, or providing a cli option to select between a list of available styles... are all means of enhancement, instead of fighting with parsing non-trivial VHDL.

Specially, because @nfrancque proposed to discuss how to define the API.

m-kru commented 4 years ago

@eine I am trying to define tree-sitter for VHDL https://github.com/m-kru/tree-sitter-vhdl. In my humble opinion it would be really good to have it as a single entry point for different tools. The parser is described in a fully declarative way. Everyone can add rules, even without programming knowledge. If I succeed, I plan to replace the vhdl_ls parser with the one generated by tree-sitter and extend the language server features. The output product is a Concrete Syntax Tree (CST). Such a tree can be trimmed and used for compilation/synthesis purposes. I can even imagine that the current GHDL parser is replaced with it. Adding new parsing functionality to a parser defined with tree-sitter takes minutes. Try to add parsing rules to GHDL. If you are not @tgingold it will probably take you hours or days.

eine commented 4 years ago

@m-kru, based on previous discussions, using a "general purpose" parser generator for VHDL is a dead-end. It might be enough to parse the most common features of VHDL 1993, and maybe some of VHDL 2008. But there seems to be consensus in both VHDL and Verilog being complex enough to require custom parsing architectures. See October 18, 2019 1:32 AM and October 17, 2019 4:02 PM.

Nonetheless, there are multiple projects which provide very useful features, even though they don't support all the language. For example Symbolator, @Nic30's many projects, or VUnit's dependency scanner.

Of course, I'd love to be wrong and I wish you all the best with tree-sitter-vhdl. If it works, I believe it might be used by many other tools. Maybe not rust_hdl, GHDL or PyVHDLParser (which have already done the work of designing the custom parsing architecture), but other tools in the ecosystem (such as this repo).

Precisely, we have discussed in VUnit whether to use an external parser (VUnit/vunit#529). We have not done it mostly because GHDL and rust_hdl are compiled tools, and PyVHDLParser is not ready yet. I see that tree-sitter has bindings for Python. Hence, if tree-sitter-vhdl provides enough features, it might be suitable to use it in VUnit. Note that I mean VUnit's Python features as a builder/test-runner (which is compatible with any HDL testing/verification framework).

Nic30 commented 4 years ago

I can potentially programmatically migrate hdlConvertor grammars to tree-sitter from ANTLR4. However, I have never seen a comparison where ANTLR4 lost against tree-sitter. (I am trying to help wherever I can; if you know of one, please share.)

@m-kru

I think VSG could be relatively easily rewritten to use rust_hdl, however what is the point if all other tools are still going to use their own parsers

Exactly, but how do you convince others to actually use any parser if it is not 100% complete? And how do you finish it 100% if that requires 1e6 man-hours? I thought I had solved it in hdlConvertor, where I used ANTLR4 parsers with 100% language support, which generate a CST, and convertors which can convert it to an AST (same for VHDL and SystemVerilog) with support for the most common code constructs. Still, the only ones actively using it are my university and several micro companies that I am in contact with.

I mean, I was also in contact with @eine; these things take serious time. It is extremely hard to convince anyone who has something half working that he/she should use your 90% (actually 51%) working tool. I recommend not relying on the fact that users can easily work with tree-sitter, as that is a very small part of the problem.

eine commented 4 years ago

@Nic30, thanks for your insight. I think you nailed it. Any of the projects we are talking about is far beyond a single human's capacity, and still most of the projects are single person shows. I believe that is because stepping into any of them is very hard, and most new users are daunted before even trying it. Many do start their own project, believing that "there must be some easier way to deal with it". But, in the end, all comes down to learning the LRM almost by heart, and that requires a lot of time and expertise.

Yet, I think we are failing as a community when communicating that to new users. The problem is not exactly parsing, but standardization of the CST/AST and further (semantic) analysis based on the AST. So, this is about understanding and time, not about coding.

Anyway, @Nic30, I believe you might want to have a look at https://github.com/jeremiah-c-leary/vhdl-style-guide/issues/312#issuecomment-660405030. I did comment about GHDL, rust_hdl and PyVHDLParser before because I thought that you were mostly focused on the waveviewer, schematicviewer, etc. and not so much on a parser with 100% language support. However, if your ANTLR4 based solution works for this project, it might be a good opportunity to discuss/document/standardize the API.

m-kru commented 4 years ago

@Nic30 here is my story and thoughts.

I started a few months ago by analyzing GHDL internals. I gave up after 2 weeks because of ... Ada. Some people say it should be easy to get into GHDL, because Ada is very similar to VHDL and hardware developers are familiar with it. As we can see on the GHDL contributors list, this argument has convinced no one. What made me feel overwhelmed was not Ada per se, but the infrastructure around Ada. There is only one IDE (GNAT Studio), which crashed for me from time to time. The whole build system is made with bash scripts and Makefiles, which are not easy to analyze. Ada is also not popular; if you encounter some problem not related to the code, it usually takes a significant amount of time to fix when googling for the problem returns nothing.

Then I took a look at rust_hdl. After 1 day I was analyzing the code, and actually understanding it. I know that GHDL is much bigger project, but in both cases I was interested in the parser. It took me around 2 weeks to get to know how rust_hdl works, how the vhdl_ls is implemented, and how it utilizes LSP protocol. I have even started working on adding support for comment parsing, but then I have discovered tree-sitter.

Writing parsers is hard. Writing parsers in imperative way is much harder. Writing parser in a declarative way is a bit simpler, but what is more important, you do not have to have very good knowledge of any programming language.

Now you may ask why not ANTLR4. The generated parser might be faster than the one generated by tree-sitter. But we do not have to be the fastest, we just need to be fast enough. In my humble opinion, tree-sitter has other advantages which are more important than speed.

1. tree-sitter is built with incremental parsing in mind. Even if the first parse is a bit slower, the next one is probably much faster than reparsing the whole file. This may be an advantage in the case of LSP and a disadvantage in the case of compilation/synthesis, but see also point 5.
2. tree-sitter has built-in support for syntax highlighting. Syntax highlighting is probably one of the basic things you expect when you work with any language. You do not need to have different regexes in different text editors. The source of the highlighting rules is the same as the source for LSP/compilation/synthesis.
3. It has bindings for different languages and can be reused by people in different tools (ANTLR4 probably also has such bindings).
4. tree-sitter's syntax and infrastructure are easier to understand and get into than ANTLR4's (subjective opinion).
5. Let's assume that the parser generated by tree-sitter is slower than the one used in GHDL (from my rough estimations it looks like their speed may be similar, but I might be wrong). With tree-sitter you can run parsers in parallel. If I remember correctly, GHDL is not capable of parallel parsing.

@eine has written:

@m-kru, based on previous discussions, using a "general purpose" parser generator for VHDL is a dead-end. It might be enough to parse the most common features of VHDL 1993, and maybe some of VHDL 2008. But there seems to be consensus in both VHDL and Verilog being complex enough to require custom parsing architectures. See October 18, 2019 1:32 AM and October 17, 2019 4:02 PM.

I have heard this argument a lot of times, but I have never seen any proof. On the contrary, @Nic30's ANTLR4 work makes me think this is rather possible.

I am just sick of everyone seeing the problem, but doing nothing to solve it. The single common parser is a must if we want to unite as a community.

@Nic30 if you could help with tree-sitter for VHDL I would really appreciate this. If we succeed, we would prove they are wrong. If we fail, well, we would at least know that there is no hope. Anyway, as you are experienced with ANTLR4, I would really like to know what you think about my 'why tree-sitter' arguments.

eine commented 4 years ago

@m-kru, WRT rust_hdl vs GHDL, I believe that the language has little to do with the complexity. GHDL does much more than just parsing, and most of the additional complexity is related to that. However, I agree that tooling in Rust is much better, and that would be a strong point to rewrite GHDL (which is an astonishing effort).

Writing parsers is hard. Writing parsers in imperative way is much harder. Writing parser in a declarative way is a bit simpler, but what is more important, you do not have to have very good knowledge of any programming language.

I am using "parser" as a wide term that involves not only generating an AST but actually doing something with it. In order to do anything (which might be as simple as preserving/removing comments from a file), you need a very good knowledge of the language. Parsing is mostly solved in many languages. What we are lacking are tools that use those.

tree-sitter is built with incremental parsing in mind

IIRC, either rust_hdl or PyVHDLParser were also designed with incremental parsing in mind.

Lets assume, that the parser generated by the tree-sitter is slower than the one used in GHDL (from my rough estimations it looks like their speed may be similar, but I might be wrong). With tree-sitter you can run parsers in parallel. If I remember correctly GHDL is not capable of parallel parsing.

Olof proposed using rust_hdl as a frontend/parser for GHDL (or the other way round), in order to avoid implementing semantic analysis in Rust. The purpose was twofold. First, to obviously reuse the knowledge that is already buried into GHDL. Second, but most important, to prototype some plugin infrastructure for third-party projects to reuse GHDL. If that worked, it would be possible to migrate GHDL to Rust progressively. However, Tristan did some tests with Rust after that, and I think he was not convinced about it. I suggest you ask him.

It is currently possible to use some of the features of GHDL through the Python wrapper (https://github.com/ghdl/ghdl/tree/master/python). I would love to see any effort for some external project to use that, or to trigger the development of a similar C/C++ interface to plug into GHDL.

I have heard this argument a lot of times, but I have never seen any proof. On the contrary, @Nic30's ANTLR4 work makes me think this is rather possible.

The proof is that you tried GHDL and rust_hdl first, and not any other "generator based" solution which you could find more easily, which had better features, or which was more used. I believe that most parsers used by vendors are custom too. I don't mean to claim that Nic's ANTLR4 is a dead-end. But he already put +2y of effort into it and I'm not sure about how much he did need to customize it. Are you willing to invest a similar amount of time to have an equivalent solution based on tree-sitter?

I am just sick of everyone seeing the problem, but doing nothing to solve it. The single common parser is a must if we want to unite as a community.

Almost everyone is doing the best to contribute to the open source community. However, people have lives and families, and almost no one likes to be told what to do if they are not paid for it. If you know of someone who has the expertise + motivation + funding to work +6 months in such tasks, I'd be glad to provide all the references and knowledge I can gather.

However, note that you are doing the same you are criticising. Dozens of projects have been started in the last decades which aimed to provide the single common parser. Most of them were started from scratch, instead of building on existing solutions. Most of them were abandoned and forgotten, so much that you probably didn't find them and that's why you think that no one is doing anything to solve it:

I would honestly suggest you take a step back and think about the problem from a different point of view. What are the features that you are missing and which cannot currently be done with existing parsers? Is the effort to enhance them significantly larger than writing your own tool from scratch?

Nic30 commented 4 years ago

@m-kru

'why tree-sitter' arguments.

I always wanted to try how incremental parsing works on a larger scale with a complicated grammar like SystemVerilog. It could be very tricky, because the preprocessor does not run in incremental mode and some code constructs may require a reparse of a very large section of the text.

I am sure GHDL does not support parallelization, but that is not the case for ANTLR4-based tools.

I believe that the area where tree-sitter has a strong advantage is integration with language servers etc. There are many example projects for such a thing. ANTLR4 can be very fast because it uses state-of-the-art methods for parsing. I have only seen tree-sitter grammars where performance was not an issue.

I am just sick of everyone seeing the problem, but doing nothing to solve it. The single common parser is a must if we want to unite as a community.

Also my words. I can help but we need a plan.

@eine

Looking at the initial commit in hdlConvertor, it is actually 5 years. 8 years ago I wished that I would one day find a project like it; after 3 years there was still nothing, and then I and several others started writing hdlConvertor. It still does not contain the verification part of SystemVerilog, and the VHDL/SV interpreter is a work in progress. I guess that it requires around 10K hours to get VHDL 2008 and SV 2017 out of it.

eine commented 4 years ago

@Nic30, I didn't want to dismiss your work at all. I said 2y as the coarse estimate of the minimum time to write a VHDL 2008 parser from scratch. In the last 5-8y you did many other things apart from that (which I find VERY exciting, since I had the same wish).

jeremiah-c-leary commented 3 years ago

@GlenNicholls, you should see what @qarlosalberto has done in issue #401 and issue #418. I think that is VSC.

jeremiah-c-leary commented 3 years ago

So...I am at the point where I would like to split myself from my current parser and use something else.

I am currently working on an issue to apply formatting rules to context clauses. I decided to try to create a different model of the VHDL file I am analyzing. I am using unique objects for the VHDL keywords etc... This might end up being my abstraction layer.

In order to add the rules for the context clauses I need a parser. I started to update mine to produce what I think I want for my abstraction layer. Now that I have something working, I think now would be the time to try a different parser and see how I can use its output.

Two questions:

1) Which converter?
- hdlConvertor
- tree-sitter version
- rust_hdl
- GHDL
- PyVHDLParser
- something else

My only hard requirements are:
1) preserves white space
2) preserves comments
3) preserves case

Soft requirements:
1) prefer a non-compiled solution for portability
2) be able to include it in vsg directly
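The three hard requirements amount to a lossless round trip: joining the parser's output back together must reproduce the input byte for byte. A minimal sketch of that check follows, assuming a toy regex tokenizer; `Token`, `tokenize` and `render` are hypothetical names for illustration and belong to none of the parsers listed above. It does not handle VHDL-2008 block comments as comments, although the round trip still holds for them.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str  # "comment", "whitespace", "string", "word" or "other"
    text: str  # exact source text, original case included


# Matching proceeds left to right, so a "--" inside a string literal is
# consumed as part of the string, and a quote inside a comment stays in
# the comment.
_TOKEN_RE = re.compile(
    r'''
    (?P<comment>--[^\n]*)              # line comment, up to end of line
  | (?P<whitespace>\s+)                # spaces, tabs, newlines, verbatim
  | (?P<string>"(?:[^"\n]|"")*")       # string literal, doubled quote inside
  | (?P<word>[A-Za-z][A-Za-z0-9_]*)    # keywords and identifiers
  | (?P<other>.)                       # any other single character
    ''',
    re.VERBOSE | re.DOTALL,
)


def tokenize(source: str) -> List[Token]:
    """Split source into tokens without dropping or altering any character."""
    return [Token(m.lastgroup, m.group()) for m in _TOKEN_RE.finditer(source)]


def render(tokens: List[Token]) -> str:
    """Reassemble the original text from the token stream."""
    return "".join(token.text for token in tokens)


SAMPLE = 'Library IEEE;  -- keep "this" comment\n\tUSE ieee.Std_Logic_1164.all;\n'
TOKENS = tokenize(SAMPLE)

# White space, comments and case all survive the round trip.
assert render(TOKENS) == SAMPLE
```

Running the same round-trip assertion against the output of any candidate parser is a quick way to check the hard requirements before investing in an integration.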

Any guidance on how to proceed would be greatly appreciated.

I was also wondering if writing a converter from the parser's output to my abstracted model would be just as much work as writing my own parser. Am I always bound to a particular parser?

It also seems like a big-bang change, not something I can evolve over time.

jeremiah-c-leary commented 3 years ago

This might be a crazy idea, but could someone write a parser for the VHDL LRM that would automatically create a parser for VHDL code? Then when the LRM changes we get a new parser automatically. It would always be up to date and 100% accurate.

eine commented 3 years ago

@Paebbels, can PyVHDLParser currently fulfill the requirements above?

Any guidance on how to proceed would be greatly appreciated.

If possible, I'd try PyVHDLParser.
Otherwise, I guess that between rust_hdl and hdlConvertor it is a matter of flavour (either of them might need some tweaks/enhancements).
GHDL is a very solid foundation, and many users will already have it installed. This can be an advantage compared to rust_hdl or hdlConvertor. However, it would require some more effort. I would suggest asking @tgingold whether the Python interface of GHDL (the one used by the LSP server) might be enough to build the features you need. If it works, you might get a "native" feeling.
- FWIW, GHDL does already have some styling features: https://ghdl.readthedocs.io/en/latest/using/CommandReference.html?#pretty-print-pp-html
tree-sitter-vhdl is only a few weeks old.

This might end up being my abstraction layer.

I was also wondering if writing a converter from the parsers output to my abstracted model would be just as much work as writing my own parser. Am I always bound to a particular parser?

On the one hand, I think that writing your own parser is much more work than converting the output (subset) of an existing parser and filtering/modifying it to suit your needs. This is especially so for future enhancements. Any future language feature that you might need to analyze is likely to be already handled by some parser.

On the other hand, I agree with you on being dependent on a particular parser. Ideally a "standard" or "common" AST would exist, so that a common interface exists between any parser and any tool such as this. However, ASTs are normally shaped after the architecture of the parser and not always stable (https://ghdl.readthedocs.io/en/latest/using/CommandReference.html?#cmdoption-ghdl-file-to-xml). I don't know if the ASTs of e.g. rust_hdl and GHDL can be modified to match each other. Apart from that, @Nic30 seems to have something named "universal HDL AST". A good starting point might be documenting and discussing that format.
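One way to limit the dependency on a particular parser is to make the tool's rules consume only its own model and to hide each backend behind a small adapter. The sketch below assumes a deliberately trivial model; `Element`, `ParserAdapter`, `LineSplitAdapter` and `find_trailing_whitespace` are hypothetical names, not classes from VSG or from any parser mentioned here, and the line-splitting backend is a stand-in where a real adapter would translate a parser's AST.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class Element:
    """One unit of the tool's own model (hypothetical, for illustration)."""
    kind: str   # "blank", "comment" or "code" in this toy model
    text: str   # original line text, unmodified
    line: int   # 1-based line number


class ParserAdapter(Protocol):
    """The only interface that rules are allowed to depend on."""

    def parse(self, source: str) -> List[Element]: ...


class LineSplitAdapter:
    """Stand-in backend; a real adapter would wrap an existing parser."""

    def parse(self, source: str) -> List[Element]:
        elements = []
        for number, line in enumerate(source.splitlines(), start=1):
            stripped = line.strip()
            if not stripped:
                kind = "blank"
            elif stripped.startswith("--"):
                kind = "comment"
            else:
                kind = "code"
            elements.append(Element(kind, line, number))
        return elements


def find_trailing_whitespace(adapter: ParserAdapter, source: str) -> List[int]:
    """Example rule: it sees Elements only, never the backend's own AST."""
    return [e.line for e in adapter.parse(source) if e.text != e.text.rstrip()]


SOURCE = "entity e is  \n-- a comment\nend;\n"
VIOLATIONS = find_trailing_whitespace(LineSplitAdapter(), SOURCE)
```

With this shape, swapping the backend means writing one new adapter class while the rules stay untouched, which also allows a gradual migration instead of a big-bang change.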

This might be a crazy idea, but could someone write a parser for the VHDL LRM that would automatically create a parser for VHDL code? Then when the LRM changes we get a new parser automatically. It would always be up to date and 100% accurate.

Currently, the LRM is not machine-readable. There is on-going work in the VHDL Analysis and Standardization Group (VASG) to use LaTeX in the next version. Hopefully, new changes will be machine-readable and will include code examples (see https://github.com/VHDL/Compliance-Tests/tree/LCS2019/issues and https://github.com/VHDL/Compliance-Tests/issues/12#issuecomment-661056114). There is a mailing-list (called "reflector") and the group meets every two weeks. You can find the references in the wiki http://www.eda-twiki.org/cgi-bin/view.cgi/P1076/WebHome.

In fact, I would say that the idea of extracting the grammar from the LRM has been discussed in the last months. However, as commented above, I believe that defining the grammar is not the most important piece in the tooling.

Nic30 commented 3 years ago

@jeremiah-c-leary

could someone write a parser for the VHDL LRM that would automatically create a parser for VHDL

https://github.com/Nic30/VHDL_and_SV_formal_grammar_to_ANTLR4

(But I had to do some manual tweaking on the top of that)

m-kru commented 3 years ago

@eine

WRT rust_hdl vs GHDL, I believe that the language has little to do with the complexity. GHDL does much more than just parsing, and most of the additional complexity is related to that. However, I agree that tooling in Rust is much better, and that would be a strong point to rewrite GHDL (which is an astonishing effort).

I didn't want to suggest, that Ada is the source of complexity. I wanted to suggest that sometimes the infrastructure around the language is what makes the difference, not the language.

IIRC, either rust_hdl or PyVHDLParser were also designed with incremental parsing in mind.

I am 99% sure that vhdl_ls, which uses rust_hdl, reparses the whole file on change.

Olof proposed using rust_hdl as a frontend/parser for GHDL (or the other way round), in order to avoid implementing semantic analysis in Rust. The purpose was twofold. First, obviously, to reuse the knowledge that is already buried in GHDL. Second, and most important, to prototype some plugin infrastructure for third-party projects to reuse GHDL. If that worked, it would be possible to migrate GHDL to Rust progressively.

This is what I was also thinking about.

I would honestly suggest you take a step back and think about the problem from a different point of view. What are the features that you are missing and which cannot currently be done with existing parsers? Is the effort to enhance them significantly larger than writing your own tool from scratch?

The problem is that different parsers are missing different features and each tool uses its own parser. If someone adds some feature to one of them, it is probably still missing in the other ones and someone has to implement the same thing in another parser. This is like multiplying bugs. What is more, with multiple parsers you need to learn how to interact with each of them, again extra time wasted.

@jeremiah-c-leary

This might be a crazy idea, but could someone write a parser for the VHDL LRM that would automatically create a parser for VHDL code.

I think this is almost impossible. VHDL is full of ambiguity. To do the full syntax parsing you need some of the semantic parsing. Also, the LRM is written in such a way that it adds extra conflicts which in reality do not exist. To get to know these artificial conflicts you need to do semantic analysis of the standard ...
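A well-known instance of this is the expression `name(arg)`: the same token sequence is a function call, an indexed name, or a type conversion, depending on how `name` was declared. The sketch below shows the kind of symbol-table lookup a parser needs before it can pick the right node type; `SYMBOLS` and `classify_call_like` are hypothetical names and the declarations are made up for illustration.

```python
from typing import Dict

# Hypothetical declarations already seen by the analyzer, keyed by
# lower-cased name because VHDL identifiers are case-insensitive.
SYMBOLS: Dict[str, str] = {
    "mem": "array",          # e.g. a signal of an array type
    "to_index": "function",  # e.g. a user-defined function
    "unsigned": "type",      # e.g. a type from numeric_std
}


def classify_call_like(prefix: str, symbols: Dict[str, str]) -> str:
    """Decide what `prefix(arg)` means; syntax alone cannot tell."""
    kind = symbols.get(prefix.lower())
    if kind == "function":
        return "function_call"
    if kind == "array":
        return "indexed_name"
    if kind == "type":
        return "type_conversion"
    # Without a declaration the construct stays ambiguous.
    return "unresolved"


RESULTS = {
    name: classify_call_like(name, SYMBOLS)
    for name in ("mem", "TO_INDEX", "unsigned", "foo")
}
```

A grammar extracted mechanically from the LRM would have to leave such nodes unresolved or be paired with at least this much semantic information.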

eine commented 3 years ago

The problem is that different parsers are missing different features and each tool uses its own parser. If someone adds some feature to one of them, it is probably still missing in the other ones and someone has to implement the same thing in another parser. This is like multiplying bugs. What is more, with multiple parsers you need to learn how to interact with each of them, again extra time wasted.

@m-kru, agree. However, I think that the best strategy to fix that is to work on it feature by feature. Naturally, one (you) could write a loooong list of desired features for some parser, then analyze each existing parser and publish some "comparison table" showing which features match, which are missing, etc. However, that is a LOT of work, because you need to learn about the internals of all the parsers. I believe that's why most users have traditionally decided to write their own tool from scratch, instead of doing that analysis effort. Maybe they did, but I have not found any article/writeup about such a technical discussion. Of course, I blame no one; it is difficult to invest all that time in something that is not directly profitable, unless someone finds a way to make a profit from the analysis/publication itself (is this publishable in Academia?).

So, my advice would be to slowly create such a table, row by row. You pick a single feature, you do the analysis, you discuss with the maintainers of the existing parsers, and you reach a conclusion about the compatibility/incompatibility. If compatible, a common interface can be proposed (Jeremiah's abstraction layer). If not, that would point us to the areas where we should focus as a community.

Note that I totally understand if you don't want to do all that work, and you are entitled to try writing a different solution in the hope that it will be more attractive to other/new contributors. But the chances are that almost no one will join your project in the future, unless you make a HUGE effort to document it, provide code examples and even do some "marketing". See any of the existing parsers we have mentioned above.

Also, if you tackle it feature by feature, there are better chances for you to have your "real" work done. Trying to solve the "single common parser" problem, which is a very hard and long-going issue, will take much of your time from the projects that motivated you to look into parsers. Nevertheless, if building the common parser is your main project, that's awesome!

qarlosalberto commented 2 years ago

I have added full support for VSG in TerosHDL: https://terostechnology.github.io/terosHDLdoc/style/start.html#vsg-vhdl-style-guide

Formatter, linting, quick fixes...

GlenNicholls commented 2 years ago

@jeremiah-c-leary I saw that and it works quite nicely in teroshdl. I haven't been keeping up with this thread lately, but feel free to close this issue if you feel that the base requirements are met.

qarlosalberto commented 2 years ago

I have suggestions for a better integration, but I can open another issue.