dalance / sv-parser

SystemVerilog parser library fully compliant with IEEE 1800-2017

Compiler directives not printed via syntax_tree.to_string() #77

Closed philipaxer closed 1 year ago

philipaxer commented 1 year ago

It looks as if compiler directives are implemented as whitespace in the parser. That seems to prevent them from being printed.

This code

    #[test]
    fn test_module() {
        let src = r##"
        `timescale 10ns/1ns
        "##;

        let (syntax_tree, _) = parse_sv_str(src, PathBuf::from(""), &HashMap::new(), &[""], false, false).unwrap();
        let mut ela = ElaborationUnit::new();
        ela.elaborate(&syntax_tree);

        print!("{}", syntax_tree.to_string());

    }

yields the following printout, which omits the compiler directive:

SourceText
   UnsignedNumber
    Token: '10' @ line:2
   TimeUnit
    Keyword
     Token: 'ns' @ line:2
   Symbol
    Token: '/' @ line:2
   UnsignedNumber
    Token: '1' @ line:2
   TimeUnit
    Keyword
     Token: 'ns' @ line:2

philipaxer commented 1 year ago

Actually, the specific issue I have is getting notified about compiler directives (specifically timescale). I have seen that compiler directives are inside a WhiteSpace enum.

So essentially I am not sure how to handle compiler directives while descending the AST. Whitespace can occur anywhere, and it feels weird to have a match statement for each possible symbol, keyword, etc. I am also relatively new to Rust, so perhaps I am just missing the obvious.

I am trying to elaborate the entire design, and the way I currently descend the AST is to look at a specific node and then match all of its possible contents. The reason for doing so is to bail out when I see unsupported features. Example below:

    fn elaborate_module(&mut self, syntax_tree: &SyntaxTree, node: &ModuleDeclaration) -> Result<Module> {
        let module_loc = unwrap_locate!(node).unwrap();

        // Dispatch on the declaration style and bail out on anything unsupported.
        // Every arm either returns a value or panics, so no trailing Err is needed.
        match node {
            ModuleDeclaration::Ansi(module_node) => elaborate_module_ansi(syntax_tree, module_node),
            ModuleDeclaration::Nonansi(module_node) => elaborate_module_nonansi(syntax_tree, module_node),
            _ => unimplemented!("Module declaration {:?} is not implemented", node),
        }
    }

DaveMcEwan commented 1 year ago

It's not obvious :p

The function parse_sv_str operates in three main stages:

  1. The preprocessor language is parsed to give a concrete syntax tree called pp_text.
  2. Preprocessor semantics are applied to turn that pp_text into a PreprocessedText structure, which is a string with some metadata about ranges.
  3. That string is parsed to form the concrete syntax_tree you're working with.

The specification of the preprocessor is a bit fuzzy but, essentially, the LRM specifies two languages (the preprocessor and everything else), and the compiler directives are lumped in with the preprocessor; see IEEE 1800-2017 clause 22. sv-parser approaches this by keeping all preprocessor and compiler directives in WhiteSpace nodes. Directives are not included in the BNF (IEEE 1800-2017 Annex A), and the rules about where they can be placed are not entirely formalised, so sv-parser just keeps them all together for "later" processing. It's not ideal, but good enough IMO.

To print every node (including WhiteSpace nodes), you can use the Debug trait, for example by formatting the syntax tree with {:?} instead of {}.

To handle compiler directives like timescale, you can match RefNode::WhiteSpace(WhiteSpace::CompilerDirective(d)) and use string comparison to find the keyword "timescale" or whatever. There are not many directives to consider, so your code (hopefully) shouldn't be too massive.
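
A minimal sketch of that matching pattern, for illustration only. The enums here are simplified stand-ins and not sv-parser's actual definitions (the real node types carry located tokens rather than plain strings), and find_timescales is a hypothetical helper name:

```rust
// Simplified stand-ins for illustration only; sv-parser's real node types differ.
#[derive(Debug)]
enum WhiteSpace {
    Space,
    Newline,
    Comment(String),
    // Assumption: the raw directive text, e.g. "`timescale 10ns/1ns".
    CompilerDirective(String),
}

#[derive(Debug)]
enum RefNode<'a> {
    Keyword(&'a str),
    Symbol(&'a str),
    WhiteSpace(&'a WhiteSpace),
}

/// Collect the arguments of every timescale directive found among the nodes.
fn find_timescales(nodes: &[RefNode<'_>]) -> Vec<String> {
    let mut found = Vec::new();
    for node in nodes {
        // One nested pattern picks out directives wherever whitespace occurs,
        // so no separate match arm per keyword or symbol is needed.
        if let RefNode::WhiteSpace(WhiteSpace::CompilerDirective(text)) = node {
            // Plain string comparison on the directive keyword.
            if let Some(args) = text.strip_prefix("`timescale") {
                found.push(args.trim().to_string());
            }
        }
    }
    found
}
```

The point of the sketch is that a single pattern in one loop over all nodes is enough; the walk does not need to know which keyword or symbol the whitespace happened to be attached to.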

I wouldn't be too worried about performance around that processing, because no sane codebase is going to be using large numbers of directives... I may be wrong :p

Is that helpful?

DaveMcEwan commented 1 year ago

To directly address this issue's title: It's correct that syntax_tree.to_string() does not print compiler directives, but that's by design (not a bug). In Rust lingo, to_string is implemented by the Display trait, but to get compiler directives you need to use the Debug trait.
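
To illustrate the Display/Debug split in general Rust terms, here is a small sketch using a hypothetical miniature tree (Node and Tree are made up for this example and are not sv-parser types). Display skips the directive node, while Debug prints everything:

```rust
use std::fmt;

// Hypothetical miniature tree, not sv-parser's SyntaxTree.
enum Node {
    Token(String),
    // Stand-in for a WhiteSpace node carrying a compiler directive.
    Directive(String),
}

struct Tree(Vec<Node>);

// Display (which also backs to_string) skips whitespace-like nodes.
impl fmt::Display for Tree {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for node in &self.0 {
            if let Node::Token(t) = node {
                writeln!(f, "Token: '{}'", t)?;
            }
        }
        Ok(())
    }
}

// Debug prints every node, including the directive.
impl fmt::Debug for Tree {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for node in &self.0 {
            match node {
                Node::Token(t) => writeln!(f, "Token: '{}'", t)?,
                Node::Directive(d) => writeln!(f, "CompilerDirective: '{}'", d)?,
            }
        }
        Ok(())
    }
}
```

With a tree like this, tree.to_string() and format!("{}", tree) go through Display, whereas format!("{:?}", tree) goes through Debug, so only the latter shows the directive.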

philipaxer commented 1 year ago

That makes sense