renatahodovan / grammarinator

ANTLR v4 grammar-based test generator

undefined variables #150

Closed Roise-yue closed 10 months ago

Roise-yue commented 1 year ago

Will the program generated by this tool have undefined variables?

renatahodovan commented 1 year ago

@Roise-yue Grammarinator has no concept of "variables" (or even of a "program"). Instead, it interprets an arbitrary language description written in the ANTLRv4 grammar format and generates output according to that grammar. The output can be a program if the input grammar is Cpp, Java, JavaScript, etc., but it can just as well be arbitrary textual data in any other language format (you can find existing examples here).

If your goal is to generate output in a programming language and you need variable matching, then you have to customize the Generator subclass produced by Grammarinator. For this, you can create a subclass of the produced generator and override the rule responsible for identifier generation. Or, you can define custom models or transformers to ensure variable matching. See the models and transformers sections of the documentation for details.
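To illustrate the idea of variable matching, here is a minimal, standalone sketch. It does not use Grammarinator's API: `NameTracker` and `generate_program` are hypothetical names invented for this example. The point is only that an overridden identifier rule could consult some shared state so that references are drawn from names that were already defined.

```python
import random


class NameTracker:
    """Hypothetical helper (not part of Grammarinator): tracks defined names."""

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.defined = []

    def define(self):
        # Invent a fresh name and record it as initialized.
        name = f"v{len(self.defined)}"
        self.defined.append(name)
        return name

    def reference(self):
        # Only return a name that was defined earlier; define one if none exists.
        if not self.defined:
            return self.define()
        return self.rng.choice(self.defined)


def generate_program(tracker, statements=3):
    # Stand-in for generator output: every right-hand side refers only to
    # names that the tracker has already handed out as definitions.
    lines = [f"{tracker.define()} = 0"]
    for _ in range(statements):
        rhs = tracker.reference()  # pick the reference before defining the target
        lines.append(f"{tracker.define()} = {rhs} + 2")
    lines.append(f"while {tracker.reference()} + 2:\n    break")
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_program(NameTracker(random.Random(0))))
```

In a real setup, the same bookkeeping would live in the subclass of the produced generator (or in a custom model or transformer), with the identifier rule calling something like `reference()` instead of emitting a random name.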

Roise-yue commented 1 year ago

I am using the grammarinator tool to generate Python programs from the ANTLR4 Python grammar, but I ran into two problems. First, when I use the tiny-Python grammar as input, the generated program references variables that were never initialized, such as "while a+2" where "a" is not initialized. Second, when I use the complete Python 3 grammar as input, the grammar file contains Unicode property escapes starting with "\p", which grammarinator does not support. Do you have any suggestions for these two problems?

In Python3Lexer.g4:

    fragment ID_START
     : '_'
     | [\p{L}]
     | [\p{Nl}]
     //| [\p{Other_ID_Start}]
     | UNICODE_OIDS
     ;

    /// id_continue ::= <all characters in id_start, plus characters in the categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
    fragment ID_CONTINUE
     : ID_START
     | [\p{Mn}]
     | [\p{Mc}]
     | [\p{Nd}]
     | [\p{Pc}]
     //| [\p{Other_ID_Continue}]
     | UNICODE_OIDC
     ;

Related code in grammarinator:

    if escaped in ('p', 'P'):
        raise ValueError('Unicode properties (\p{...}) are not supported')

renatahodovan commented 10 months ago

@Roise-yue As for the first problem, variable matching in the generated output, see my previous comment.

As for the second, Unicode properties starting with \p are supported on the latest master since https://github.com/renatahodovan/grammarinator/commit/869828ccdad1fd99cc4ed4cb13d6a72eb66045f1.
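For readers unfamiliar with the `\p{...}` notation, the following sketch shows what the character classes in the quoted lexer fragments denote, using Python's standard `unicodedata` module. It is not how Grammarinator implements the feature. The function names are made up for this example, and the `UNICODE_OIDS` / `UNICODE_OIDC` alternatives of the grammar are left out.

```python
import unicodedata


def is_id_start(ch):
    # Mirrors: '_' | [\p{L}] | [\p{Nl}]
    # \p{L} covers every general category beginning with "L" (Lu, Ll, Lt, Lm, Lo).
    cat = unicodedata.category(ch)
    return ch == "_" or cat.startswith("L") or cat == "Nl"


def is_id_continue(ch):
    # Mirrors: ID_START | [\p{Mn}] | [\p{Mc}] | [\p{Nd}] | [\p{Pc}]
    cat = unicodedata.category(ch)
    return is_id_start(ch) or cat in ("Mn", "Mc", "Nd", "Pc")


if __name__ == "__main__":
    for ch in ["a", "_", "\u00e9", "\u2167", "1", "-"]:
        print(repr(ch), unicodedata.category(ch), is_id_start(ch), is_id_continue(ch))
```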

I hope it helps!