antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License

Antlr4 (target Python 3) generated files and mypy (>= 0.91) type-checking issues #4183

Open DagSonntag opened 1 year ago

DagSonntag commented 1 year ago

Hi, I have been using mypy together with ANTLR4-generated Python code for some time. This has mostly worked because I could exclude the ANTLR4-generated files from the mypy check, since they have never been fully compliant. However, from mypy 0.91 onward, some errors are raised even for excluded files when they affect files that depend on the excluded files. Because the ANTLR-generated files are massive, the errors from those files bloat the log of any mypy type-checking run, making it hard to see the errors in one's own code.

This is certainly an issue for mypy, but I am also wondering whether it is desirable to make the ANTLR4-generated Python code more mypy (PEP) typing-friendly. The main issues I have found are (examples from the Java/Java grammar):

(Edit: Two additional issues were also raised previously, but are now moved to separate issues (#4188 #4189 ))

So, my question is: would these fixes be desirable, or has there been any discussion about what level/PEP the generated files should follow? (I.e., is this seen as an issue, or would provided fixes be rejected?)

Last of all; Thank you for a great tool! It is most useful! :)

ericvergnaud commented 1 year ago

Hi,

thanks for this.

I welcome efforts to improve the usability of generated code if it’s not tied to a particular tool.

There are 3 topics here, and maybe they would be better addressed in 3 separate conversations, except the 1st one.

Re usage of * for imports: with the current tool’s infrastructure, it is impossible to know exactly which set of classes needs to be imported at the time the import statement is generated, because it depends on the grammar. As an example based on your own proposals, importing Optional from typing would only be necessary if the grammar has a repeating rule (such as id+ or id*). The bad news is that we don’t know whether that’s the case when generating the import, and if it’s not, you’ll get a warning for an unused import.

So the strategy here would rather be to disable PEP import-related warnings as part of generating the code (if that’s possible).
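One way to picture the "disable warnings in the generated code" strategy is a suppression header at the top of each generated file. The sketch below is only an illustration, not something the ANTLR tool does: the helper name add_suppression_header is hypothetical, and the two comment lines are the per-file opt-outs understood by mypy and flake8 respectively, which does tie the output to particular tools.

```python
# Hypothetical helper, not part of ANTLR: prepend per-file suppression
# comments so linters and type checkers skip the generated module.
SUPPRESSION_HEADER = (
    "# mypy: ignore-errors\n"  # mypy inline configuration for this file
    "# flake8: noqa\n"         # flake8 skips files containing this line
)


def add_suppression_header(generated_source: str) -> str:
    """Return the source with the suppression header added exactly once."""
    if generated_source.startswith(SUPPRESSION_HEADER):
        return generated_source
    return SUPPRESSION_HEADER + generated_source
```

Because the header is a fixed string, it could be emitted before anything is known about the grammar, which is the constraint described above.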

Would you mind opening separate threads for the 2 other topics ?

Eric

On 15 March 2023 at 15:07, Dag Sonntag @.***> wrote:

Hi, I have been using mypy together with ANTLR4-generated Python code for some time. This has mostly worked because I could exclude the ANTLR4-generated files from the mypy check, since they have never been fully compliant. However, from mypy 0.91 onward, some errors are raised even for excluded files when they affect files that depend on the excluded files. Because the ANTLR-generated files are massive, the errors from those files bloat the log of any mypy type-checking run, making it hard to see the errors in one's own code.

This is certainly an issue for mypy, but I am also wondering whether it is desirable to make the ANTLR4-generated Python code more mypy (PEP) typing-friendly. The main issues I have found are (examples from the Java/Java grammar):

The usage of * imports.

  • In, for example, JavaParser, the first import statement is from antlr4 import *. This is something mypy does not like, and it results in a lot of 'name-defined' errors like JavaParser.py:11145: error: Name "ParseTreeListener" is not defined [name-defined]
  • Could be resolved by explicitly importing the classes instead, replacing the *

The usage of implicit optional.

  • Implicit optional is used all over the place in the method declarations in the ANTLR-generated files, for example: def variableModifier(self, i:int=None):. This is not allowed since PEP 484, and while one can turn off this check in mypy, it seems like an easy fix to just add Optional everywhere (i.e. def variableModifier(self, i:Optional[int]=None): with an extra import of Optional from typing).

__slots__ seems to be misused in the ANTLR-generated files.

  • Examples are JavaParser.py:10217: error: Trying to assign name "bop" that is not in "__slots__" of type "JavaParser.JavaParser.ExpressionContext" [misc]
  • Would probably require more work, but it should be possible to find these cases and add the appropriate values to __slots__

So, my question is: would these fixes be desirable, or has there been any discussion about what level/PEP the generated files should follow? (I.e., is this seen as an issue, or would provided fixes be rejected?)
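As a rough illustration of the second and third points, the sketch below shows an explicit Optional annotation and a __slots__ declaration that lists every attribute the class assigns. ExpressionContextSketch is a hypothetical stand-in written for this example, not actual ANTLR output; only the method name variableModifier and the attribute name bop are taken from the messages quoted above.

```python
from typing import Optional


class ExpressionContextSketch:
    """Hypothetical stand-in for a generated context class."""

    # Every attribute assigned on the instance is listed here, so the
    # 'not in "__slots__"' complaint would not apply to "bop".
    __slots__ = ("parser", "bop")

    def __init__(self, parser: Optional[object] = None) -> None:
        self.parser = parser
        self.bop: Optional[str] = None

    # Explicit Optional[int] instead of the implicit form i:int=None.
    def variableModifier(self, i: Optional[int] = None) -> str:
        return "all" if i is None else f"child {i}"
```

The behaviour at run time is the same as with the implicit form; only the annotations and the __slots__ tuple change.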

Last of all; Thank you for a great tool! It is most useful! :)


DagSonntag commented 1 year ago

Hi, thank you for the fast response. Hmm, OK, I haven't looked into the code yet, but it seems to me that there are two ways to do it:

Then do a post-edit of the first lines of the generated document/file.

You don't think any of those strategies are possible with the current code generation flow? Or do the lines in the file have to be generated in order?
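The post-edit idea could look roughly like the sketch below, which runs on an already generated file instead of inside the ANTLR tool. It is a minimal sketch under stated assumptions: replace_star_import is a hypothetical helper, the caller must supply the set of names the runtime package exports (the sketch does not discover them), and only plain name references are detected.

```python
import ast
from typing import Iterable


def replace_star_import(source: str, exported_names: Iterable[str],
                        module: str = "antlr4") -> str:
    """Rewrite 'from <module> import *' into an explicit import list.

    exported_names is supplied by the caller (an assumption of this sketch).
    """
    star_line = f"from {module} import *"
    # Collect every bare name referenced anywhere in the generated module.
    used = {node.id for node in ast.walk(ast.parse(source))
            if isinstance(node, ast.Name)}
    needed = sorted(used & set(exported_names))
    if not needed:
        return source  # nothing to import explicitly; leave the file as is
    explicit = f"from {module} import {', '.join(needed)}"
    lines = source.splitlines(keepends=True)
    for idx, line in enumerate(lines):
        if line.strip() == star_line:
            # Replace only the first star import and keep the line ending.
            lines[idx] = line.replace(star_line, explicit)
            break
    return "".join(lines)
```

Since the import list is computed after the rest of the file exists, the import line no longer has to be known at the moment it is first written.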

ericvergnaud commented 1 year ago

Hi, sorry for the late reply. I think both of these would be very costly in code changes, and thus risky.