hdl / pyHDLParser

Simple Python parser for extracting HDL (VHDL or Verilog) documentation
MIT License

VHDL: parsing component declarations inside entities #7

Open vvvverre opened 2 years ago

vvvverre commented 2 years ago

This bug was passed to me by someone else, but I am posting it as an issue so we can track it.

Currently pyHDLParser does not correctly parse component declarations inside the architecture of an entity. Below is a simple example of a VHDL file that reproduces this bug.

library ieee;
use ieee.std_logic_1164.all;

entity three_bit_adder is
    port (
        a       : in std_logic_vector(2 downto 0);
        b       : in std_logic_vector(2 downto 0);
        cin     : in std_logic;
        s       : out std_logic_vector(2 downto 0);
        cout    : out std_logic
    );
end entity three_bit_adder;

architecture arch of three_bit_adder is

    component full_adder is
        port (
            x, y, ci: in std_logic;
            q, co: out std_logic
        );
    end component; 

    signal c: std_logic_vector(3 downto 0);

begin

    c(0) <= cin;

    FA0: full_adder port map(x => a(0), y => b(0), ci => c(0), q => s(0), co => c(1));
    FA1: full_adder port map(x => a(1), y => b(1), ci => c(1), q => s(1), co => c(2));
    FA2: full_adder port map(x => a(2), y => b(2), ci => c(2), q => s(2), co => c(3));

    cout <= c(3);

end arch;

I used the following Python code to reproduce the bug:

import hdlparse.vhdl_parser as vhdl

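vhdl_code = """
(contents of the VHDL file above)
"""

vhdl_ex = vhdl.VhdlExtractor()
vhdl_objs = vhdl_ex.extract_objects_from_source(vhdl_code)

for o in vhdl_objs:
    # The loop body is missing from this copy of the issue; dump() is an
    # assumption based on the format of the output shown below.
    o.dump()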

The output I would expect for this is as follows:

VHDL entity: three_bit_adder
    a (<class 'str'>), VhdlParameterType('std_logic_vector','(2 downto 0)') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    b (<class 'str'>), VhdlParameterType('std_logic_vector','(2 downto 0)') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    cin (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    s (<class 'str'>), VhdlParameterType('std_logic_vector','(2 downto 0)') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    cout (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
VHDL component: full_adder
    x (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    y (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    ci (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    q (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    co (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)

Instead the output is as follows:

VHDL entity: three_bit_adder
    a (<class 'str'>), VhdlParameterType('std_logic_vector','(2 downto 0)') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    b (<class 'str'>), VhdlParameterType('std_logic_vector','(2 downto 0)') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    cin (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    s (<class 'str'>), VhdlParameterType('std_logic_vector','(2 downto 0)') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    cout (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    x (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    y (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    ci (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    q (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    co (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)

There are two issues here:

  1. The component declaration for the component full_adder is never recognized
  2. The ports of the full_adder component are added to the ports of the entity

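I believe the second issue is caused by a regex that is too strict: it matches the closing syntax end three_bit_adder; but not end entity three_bit_adder;. As a result the parser presumably never leaves the entity state at the end of the entity, so the port list of the component is picked up as if it belonged to the entity. I think this can be fixed like this: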

@@ -87,7 +94,7 @@ vhdl_tokens = {
     'entity': [
         (r'generic\s*\(', None, 'generic_list'),
         (r'port\s*\(', None, 'port_list'),
-        (r'end\s+\w+\s*;', 'end_entity', '#pop'),
+        (r'end\s+(entity\s+)?\w+\s*;', 'end_entity', '#pop'),
         (r'/\*', 'block_comment', 'block_comment'),
         (r'--.*\n', None),
     ],
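
To see why the original pattern misses the longer closing form, the two regexes from the diff above can be tried on both spellings. This is a minimal sketch using only Python's re module; applying the patterns with match() and re.IGNORECASE is an assumption about how the lexer uses them, and the names OLD_END_ENTITY, NEW_END_ENTITY and results are illustrative only.

```python
import re

# Patterns copied from the diff above. The IGNORECASE flag is an assumption
# (VHDL keywords are case-insensitive), not something taken from the library.
OLD_END_ENTITY = re.compile(r'end\s+\w+\s*;', re.IGNORECASE)
NEW_END_ENTITY = re.compile(r'end\s+(entity\s+)?\w+\s*;', re.IGNORECASE)

samples = [
    "end three_bit_adder;",
    "end entity three_bit_adder;",
]

# For each sample, record (matched by old pattern, matched by new pattern).
results = {
    text: (OLD_END_ENTITY.match(text) is not None,
           NEW_END_ENTITY.match(text) is not None)
    for text in samples
}

for text, (old_ok, new_ok) in results.items():
    print(f"{text!r}: old={old_ok} new={new_ok}")
```

The old pattern fails on the second sample because, after consuming the word entity, it expects a semicolon and finds another identifier instead.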

With this regex change, the output becomes:

VHDL entity: three_bit_adder
    a (<class 'str'>), VhdlParameterType('std_logic_vector','(2 downto 0)') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    b (<class 'str'>), VhdlParameterType('std_logic_vector','(2 downto 0)') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    cin (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    s (<class 'str'>), VhdlParameterType('std_logic_vector','(2 downto 0)') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)
    cout (<class 'str'>), VhdlParameterType('std_logic','') (<class 'hdlparse.vhdl_parser.VhdlParameterType'>)

This output is closer to correct, but the component declaration is still missing. Adding the following token to the architecture state gives the expected output:

@@ -95,6 +102,7 @@ vhdl_tokens = {
         (r'end\s+\w+\s*;', 'end_arch', '#pop'),
         (r'/\*', 'block_comment', 'block_comment'),
         (r'type\s+(\w+)\s*is', 'type', 'type_decl'),
+        (r'component\s+(\w+)\s*is', 'component', 'component'),
         (r'--.*\n', None),
     ],
     'generic_list': [
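
The effect of the added token can be illustrated with a toy state-stack lexer that uses the same (pattern, action, next state) tuples as the diff above. This is only a sketch: TOKENS, SAMPLE and lex are hypothetical names, the table is heavily simplified, the closing rule for the component state is assumed, and none of this is pyHDLParser's actual lexer.

```python
import re

# Toy token table in the same (pattern, action, next_state) shape as the diff.
# It is a simplified stand-in, not pyHDLParser's real table.
TOKENS = {
    'root': [
        (r'entity\s+(\w+)\s+is', 'entity', 'entity'),
        (r'architecture\s+(\w+)\s+of\s+(\w+)\s+is', 'architecture', 'architecture'),
    ],
    'entity': [
        (r'end\s+(entity\s+)?\w+\s*;', 'end_entity', '#pop'),
    ],
    'architecture': [
        (r'end\s+(architecture\s+)?\w+\s*;', 'end_arch', '#pop'),
        (r'component\s+(\w+)\s*is', 'component', 'component'),
    ],
    'component': [
        # Assumed closing rule for the toy 'component' state.
        (r'end\s+component(\s+\w+)?\s*;', 'end_component', '#pop'),
    ],
}

SAMPLE = """
entity adder is
end entity adder;

architecture arch of adder is
    component full_adder is
    end component;
begin
end arch;
"""


def lex(text, tokens):
    """Return the actions fired while scanning text with a state stack."""
    compiled = {
        state: [(re.compile(pat, re.IGNORECASE), action, nxt)
                for pat, action, nxt in rules]
        for state, rules in tokens.items()
    }
    stack, pos, actions = ['root'], 0, []
    while pos < len(text):
        for pat, action, nxt in compiled[stack[-1]]:
            m = pat.match(text, pos)
            if m:
                actions.append(action)
                if nxt == '#pop':
                    stack.pop()
                else:
                    stack.append(nxt)
                pos = m.end()
                break
        else:
            pos += 1  # no rule matched at this position; skip one character
    return actions


# Same table with the component rule removed, as before the proposed change.
without_rule = dict(TOKENS)
without_rule['architecture'] = TOKENS['architecture'][:1]

actions_with = lex(SAMPLE, TOKENS)
actions_without = lex(SAMPLE, without_rule)
print(actions_with)
print(actions_without)
```

In this toy, removing the component rule means that end component; is consumed by the end_arch pattern, so the architecture is closed early and no component action is ever fired. The real parser may behave differently in detail.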

One outstanding question I have is whether there should be some way of indicating that the VHDL component is declared inside the architecture?

vvvverre commented 2 years ago

I forgot to mention I have a branch in my fork with the potential solutions I mentioned above, in case anyone would like to test them: https://github.com/vvvverre/pyHDLParser/tree/fix_component_in_entity

vvvverre commented 2 years ago

I also just realised the code for handling architectures has the same bug:

@@ -92,7 +99,7 @@ vhdl_tokens = {
         (r'--.*\n', None),
     ],
     'architecture': [
-        (r'end\s+\w+\s*;', 'end_arch', '#pop'),
+        (r'end\s+(architecture\s+)?\w+\s*;', 'end_arch', '#pop'),
         (r'/\*', 'block_comment', 'block_comment'),
         (r'type\s+(\w+)\s*is', 'type', 'type_decl'),
         (r'component\s+(\w+)\s*is', 'component', 'component'),
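
VHDL also allows an architecture to be closed without repeating its name (end architecture;) or with a bare end;. The sketch below checks which closing forms the pattern from the diff accepts. PERMISSIVE is a hypothetical alternative written for this illustration; it is not part of the issue, the fork, or the library.

```python
import re

# Pattern from the diff above.
PROPOSED = re.compile(r'end\s+(architecture\s+)?\w+\s*;', re.IGNORECASE)

# Hypothetical alternative (an assumption for illustration only): both the
# keyword and the name are optional.
PERMISSIVE = re.compile(r'end(\s+architecture)?(\s+\w+)?\s*;', re.IGNORECASE)

closing_forms = [
    "end arch;",
    "end architecture arch;",
    "end architecture;",
    "end;",
]

# For each form, record (accepted by PROPOSED, accepted by PERMISSIVE).
coverage = {
    text: (PROPOSED.match(text) is not None,
           PERMISSIVE.match(text) is not None)
    for text in closing_forms
}

for text, (proposed_ok, permissive_ok) in coverage.items():
    print(f"{text!r}: proposed={proposed_ok} permissive={permissive_ok}")
```

Only the bare end; is missed by the proposed pattern. Both patterns would also match other closing statements such as end process;, so this sketch only addresses which spellings of the architecture terminator are covered.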

umarcor commented 2 years ago

@vvvverre maybe you can create a PR from branch fix_component_in_entity, and add the code example above as a test file?