renatahodovan / grammarinator

ANTLR v4 grammar-based test generator

Can max depth >20? #211

Closed Roise-yue closed 1 month ago

Roise-yue commented 10 months ago

When using this tool, I found that when the maximum depth is greater than 20, the program crashes (the tool gets stuck), so I would like to ask: what is the largest value the maximum depth can be set to? In addition, when generating C programs with the C grammar, I found that some result files are empty. Is this because I set the maximum depth to 20, and when the grammar rules have been derived down to the 20th level, they have not yet reached a terminal? I would like to reproduce the entire process of testing JerryScript described in your paper. Could you please guide me on the replication process?

renatahodovan commented 6 months ago

Hi @Roise-yue !

I'm sorry for the late reply. The generator should not get stuck or crash when the recursion limit is set to around 20. A crash usually happens when no depth limit is set (especially if the grammar contains recursion) and Python reaches its system recursion limit (which can be raised with the --sys-recursion-limit CLI argument of grammarinator-generate). However, it is true that the higher the allowed tree depth, the greater the chance of recursion or simply of generating large trees, since there is no default limit on the width of the tree. If you want more control over the size of the output tree, I'd suggest experimenting with the --max-tokens argument, with or without --max-depth. It sets an upper limit on the number of tokens in the output.
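To make the two effects above concrete, here is a minimal Python sketch. It does not use Grammarinator: `nest`, `demo_recursion_limit` and `generate` are hypothetical helpers, and the `max_depth` and `max_tokens` parameters only imitate the ideas behind --max-depth and --max-tokens on a toy grammar. The first part shows a deep derivation failing with RecursionError until the interpreter limit is raised. The second part shows that a depth limit alone leaves the width of the tree unbounded, while a token budget caps the total size.

```python
import random
import sys


def nest(levels):
    """Derive '(' * levels + 'x' + ')' * levels, using one Python frame per level."""
    if levels == 0:
        return "x"
    return "(" + nest(levels - 1) + ")"


def demo_recursion_limit():
    """Show that a deep derivation fails until the interpreter limit is raised."""
    old_limit = sys.getrecursionlimit()
    levels = old_limit + 100  # deeper than the interpreter currently allows
    try:
        nest(levels)
        failed_first = False
    except RecursionError:
        failed_first = True
    try:
        sys.setrecursionlimit(old_limit + 1000)
        succeeded_after = len(nest(levels)) == 2 * levels + 1
    finally:
        sys.setrecursionlimit(old_limit)  # restore the original limit
    return failed_first, succeeded_after


def generate(rng, max_depth, max_tokens):
    """Toy generator for  node : '(' node* ')' | 'x' ;  with both limits applied.

    Assumes max_tokens >= 1, since every derivation emits at least one token.
    """
    budget = [max_tokens]  # shared, mutable count of tokens still allowed

    def node(depth):
        # Take the terminal alternative when either limit leaves no room.
        if depth >= max_depth or budget[0] < 2 or rng.random() < 0.3:
            budget[0] -= 1
            return ["x"]
        budget[0] -= 2  # reserve '(' and ')'
        tokens = ["("]
        # The '*' quantifier: width is bounded only by the token budget.
        while budget[0] >= 1 and rng.random() < 0.7:
            tokens += node(depth + 1)
        tokens.append(")")
        return tokens

    return node(1)
```

Calling `generate(random.Random(0), max_depth=20, max_tokens=50)` returns a token list of at most 50 entries, whereas a very large `max_tokens` lets the same depth limit produce much bigger trees.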

The empty output files are probably the result of the grammar. If you use the official C grammar from grammars-v4, then the start rule is:

compilationUnit
    : translationUnit? EOF
    ;

which means that there is a 50% chance of generating translationUnit and a 50% chance of omitting it. You can either change the grammar by removing the ?, or create a custom model that ensures the decision for this ? option always evaluates to true (you can see examples here).
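The custom-model idea can be sketched without Grammarinator. In this toy version, the class names and the `include_optional` method are hypothetical and do not reflect Grammarinator's actual model API. It only illustrates how overriding a single decision removes the empty outputs.

```python
import random


class RandomModel:
    """Hypothetical stand-in for a default decision model: '?' is a coin flip."""

    def __init__(self, rng):
        self.rng = rng

    def include_optional(self, rule_name):
        return self.rng.random() < 0.5


class AlwaysIncludeModel(RandomModel):
    """Custom model that always takes the '?' branch of the start rule."""

    def include_optional(self, rule_name):
        if rule_name == "compilationUnit":
            return True
        return super().include_optional(rule_name)


def compilation_unit(model):
    """compilationUnit : translationUnit? EOF ;  (EOF contributes no text)."""
    if model.include_optional("compilationUnit"):
        return "int main(void) { return 0; }"  # placeholder for translationUnit
    return ""
```

With `RandomModel`, roughly half of the calls to `compilation_unit` return an empty string. With `AlwaysIncludeModel`, none do.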

Is this because I set the maximum depth to 20, and when the grammar rules have been derived down to the 20th level, they have not yet reached a terminal?

This cannot happen, since Grammarinator will increase the maximum depth to the minimal value needed to finish the generation.
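One way to picture this adjustment is to compute, for every rule, the smallest depth at which it can finish, and then never let the limit fall below that value for the start rule. The sketch below assumes a hypothetical toy grammar and hypothetical helper names. It illustrates the idea only and is not Grammarinator's implementation.

```python
import math

# Hypothetical toy grammar: rule -> alternatives, each a list of symbols.
# Names that are not keys of the dict are treated as terminals (depth 0).
GRAMMAR = {
    "stmt": [["expr", ";"], ["{", "stmt", "}"]],
    "expr": [["term"], ["expr", "+", "term"]],
    "term": [["ID"], ["(", "expr", ")"]],
}


def min_depths(grammar):
    """Fixed-point iteration: the minimum depth of a rule is 1 plus its cheapest
    alternative, where an alternative costs as much as its deepest symbol."""
    depth = {rule: math.inf for rule in grammar}
    changed = True
    while changed:
        changed = False
        for rule, alternatives in grammar.items():
            best = min(
                1 + max((depth.get(sym, 0) for sym in alt), default=0)
                for alt in alternatives
            )
            if best < depth[rule]:
                depth[rule] = best
                changed = True
    return depth


def effective_max_depth(grammar, start_rule, requested):
    """Raise the requested limit if the start rule cannot finish within it."""
    return max(requested, min_depths(grammar)[start_rule])
```

For this toy grammar, `stmt` needs at least three levels (stmt, expr, term), so a requested limit of 2 would be raised to 3, while a requested limit of 20 stays unchanged.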

Could you please guide me on the replication process?

It was some time ago, but AFAIK it contained:

Roise-yue commented 5 months ago

Thank you very much for your reply. I did not keep using the tool and did not see your message in time, but I still appreciate your detailed answer.