antlr / grammars-v4

Grammars written for ANTLR v4; expectation that the grammars are free of actions.
MIT License
10.23k stars 3.71k forks

SystemVerilog Preprocessor directives cause ast to be empty #4295

Open ccatrett opened 3 weeks ago

ccatrett commented 3 weeks ago

Minimal examples

Consider the following .svh files:

`ifndef ___SYS_DEFS_SVH__
`define ___SYS_DEFS_SVH__
`timescale 1ns/100ps
`define FALSE 1'h0
`define TRUE 1'h1
typedef logic [31:0] ADDR;
`endif

`define ___SYS_DEFS_SVH__
`timescale 1ns/100ps
`define FALSE 1'h0
`define TRUE 1'h1

`ifdef ___SYS_DEFS_SVH__
    typedef logic [31:0] ADDR;
`endif

Both files result in the following AST: (source_text <EOF>)

This should not be the case for either of these two files.

From viewing the .g4 files, it looks like there is support for preprocessor directives.

Here is the Python script I am running to generate this output:

import sys

from antlr4 import *
from SystemVerilogLexer import SystemVerilogLexer
from SystemVerilogParser import SystemVerilogParser

def main(input_file):
    input_stream = FileStream(input_file)
    lexer = SystemVerilogLexer(input_stream)
    token_stream = CommonTokenStream(lexer)
    parser = SystemVerilogParser(token_stream)

    tree = parser.source_text()

    print(tree.toStringTree(recog=parser))

if __name__ == '__main__':
    input_file = sys.argv[1]
    main(input_file)

I would appreciate any help to get these cases working. Thank you.

msagca commented 3 weeks ago

Hi @ccatrett. Tokens associated with preprocessor directives are stored in a separate channel. If the input consists only of directives, the default channel will contain nothing except the EOF token, and you will see an empty syntax tree. To parse the directives, specify the directives channel by its ID when creating the token stream, and pass that stream to a SystemVerilogPreParser instance.
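
A minimal sketch of that suggestion, adapted from the script in the original post. Several names here are assumptions and not confirmed in this thread: the channel name "DIRECTIVES", the `channelNames` attribute on the generated lexer, and the `source_text` start rule of SystemVerilogPreParser. Check the generated files before relying on any of them. `channel_id` and `parse_directives` are hypothetical helper names.

```python
def channel_id(lexer_cls, name="DIRECTIVES"):
    """Return the numeric ID of a named lexer channel.

    Assumes lexer_cls exposes a `channelNames` list, as lexers generated
    for the ANTLR Python target typically do. "DIRECTIVES" is an assumed
    channel name; verify it in the generated lexer.
    """
    names = list(getattr(lexer_cls, "channelNames", []))
    if name not in names:
        raise ValueError(f"channel {name!r} not found; available: {names}")
    return names.index(name)


def parse_directives(input_file, lexer_cls, pre_parser_cls,
                     start_rule="source_text"):
    """Parse only the tokens on the directives channel.

    start_rule is an assumption; use the actual entry rule of the
    preprocessor parser.
    """
    # Imported here so channel_id() can be used without the ANTLR runtime.
    from antlr4 import CommonTokenStream, FileStream

    lexer = lexer_cls(FileStream(input_file))
    # Read from the directives channel instead of the default channel.
    token_stream = CommonTokenStream(lexer, channel_id(lexer_cls))
    parser = pre_parser_cls(token_stream)
    tree = getattr(parser, start_rule)()
    return tree.toStringTree(recog=parser)
```

With the generated modules on the path, this would be called as `parse_directives(sys.argv[1], SystemVerilogLexer, SystemVerilogPreParser)`, where SystemVerilogPreParser is imported from its generated module the same way as the other two classes.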