Open sk- opened 6 years ago
If you're doing refactoring, wouldn't it be better to use lib2to3's AST? I've got something that resolves the non-dynamic names in lib2to3's AST and am (slowly) working on resolving the dynamic names (e.g., `y` in `x = MyClass(); x.y`) and imported names (just a "small matter of programming").
I'm interested in this - particularly for convenient inspection of source code in an IDE. Most other typed languages can support "type reveal on hover" for expressions.
This is easier said than done. Of course it would be helpful if it magically existed, but this is not a simple addition to the stdlib ast module.
--Guido van Rossum (python.org/~guido)
Do you need an annotated AST, or is it sufficient to have the tokens in the file mapped to fully qualified names (FQNs) and types? I have some code for mapping all the tokens to FQNs and partial code for their type information (the main thing missing is for imports, which I'm working on but it's summer and I'm doing it in my spare time).
@kamahen What do you mean by tokens? I'm guessing you mean names/bindings, or do you mean the tokens output by the `tokenize` module?
That would still be very helpful, as it would make it easy to refactor calls to a module without having to make assumptions about how it is imported.
@sk- Yes, it's more or less the tokens as output by the `tokenize` module, although I use `lib2to3`, which has its own tokenizer. My output is a simplified AST with the tokens and fully qualified names. (It would probably be easy to merge this information back into the original AST by doing a simple tree traversal of the AST while progressing through the list of tokens with their FQNs.)
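A minimal sketch of that merge step, using the stdlib `ast` and `tokenize` modules rather than `lib2to3`. The FQN table here is a hypothetical stand-in for a real resolver's output (it simply maps token text to a made-up name), and the `fqn` attribute is an illustrative name, not part of `ast`. Instead of walking tokens and tree in lockstep, it keys the tokens by their end position and looks them up from each `Name`/`Attribute` node.

```python
import ast
import io
import tokenize


def name_tokens(source):
    """Yield (end_position, text) for every NAME token in the source."""
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            yield tok.end, tok.string


def attach_fqns(source, fqn_by_token_end):
    """Parse the source and copy FQNs onto matching Name/Attribute nodes.

    A Name node ends where its token ends, and an Attribute node ends where
    its attribute-name token ends, so the token end position works as a key.
    Note: ast offsets are UTF-8 byte offsets, tokenize columns are characters;
    they agree for ASCII-only lines such as the ones used here.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Name, ast.Attribute)):
            key = (node.end_lineno, node.end_col_offset)
            node.fqn = fqn_by_token_end.get(key)  # None when unresolved
    return tree


SOURCE = "import os.path\nresult = os.path.join('a', 'b')\n"

# Hypothetical resolver output, keyed by token text for brevity.
HYPOTHETICAL_FQNS = {
    "result": "example.result",
    "os": "os",
    "path": "os.path",
    "join": "os.path.join",
}
fqn_by_end = {
    end: HYPOTHETICAL_FQNS[text]
    for end, text in name_tokens(SOURCE)
    if text in HYPOTHETICAL_FQNS
}

tree = attach_fqns(SOURCE, fqn_by_end)
resolved = sorted(
    (node.end_col_offset, node.fqn)
    for node in ast.walk(tree)
    if getattr(node, "fqn", None)
)
print(resolved)
```

Names that are not `Name` or `Attribute` nodes (import aliases, function parameters, definitions) would need their own handling in a fuller version.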
Eventually, I hope to have inferred types with all the tokens, but that code is currently missing some features, such as proper handling of `import`.
If you want to play around with my code, I can give you my latest version (which isn't yet on github).
Are you planning on using the AST in `lib2to3`, which has the source location information in its AST, or something else?
@kamahen That'd be perfect, as I'm also using `lib2to3`.
I'd be happy to play with the code you have so far.
OK, let me get things into a slightly better shape, then I'll send it to you (or put it on github if it's not too awful). This week is rather busy, but hopefully some time next week.
There are two parts of the code -- the first produces a simplified AST with fully qualified names (currently, it outputs in JSON); the second takes that simplified AST and figures out how to resolve `.` operations (which is what you want for your smarter refactoring). Most of the resolving logic is done, except for handling imports. The output is also JSON, in a somewhat unfriendly format. But that can be easily changed.
(BTW, the most expensive part of the code seems to be the JSON marshaling/unmarshaling).
I've pushed an interim version of my code to https://github.com/kamahen/pykythe
Its outputs will take some explaining, so (assuming you want to play with it), I suggest you follow the setup instructions and run the tests (`make all_tests all_test2`). At that point, I can tell you what to look at and how to interpret the outputs (and, if you wish, I can probably produce the outputs in a different and easier to use format ... for example, if you only want to use the fully-qualified name outputs, there's a simpler way of running the code).
The main things that are missing:
I'm closing this since there is no concrete proposal and there doesn't seem to be much active interest in this issue.
Just wanted to mention that [LibCST](https://github.com/Instagram/LibCST) has a `TypeInferenceProvider` which uses Pyre's query functionality. See https://pyre-check.org/docs/querying-pyre.html and https://libcst.readthedocs.io/en/latest/metadata.html#type-inference-metadata.
It'd be great if we could use Mypy instead of Pyre.
Hm, dmypy (mypy's daemon mode) has much of the same information available, there's just no API for it yet. We do have an experimental API that suggests the signature for an unannotated function based on how it's called (`dmypy suggest`).
Maybe we should develop something similar to Pyre's API? Maybe we could even just copy the same API style, to make it easier for clients to switch.
Okay, let's reopen this since there is renewed interest.
> Maybe we should develop something similar to Pyre's API? Maybe we could even just copy the same API style, to make it easier for clients to switch.
This would be a reasonable thing to have. At least most of the Pyre API features should be easy enough to implement on top of dmypy.
The core team doesn't have a lot of spare cycles, but if somebody wants to look into this, I'm happy to give some help.
I have made a tool to enhance `ast` with metadata from `mypy`:
import sys
a = 1
b = 2
print(a is b)
Output:
» typed-linter ex.py
Original AST:
Module(body=[Import(names=[alias(name='sys')]), Assign(targets=[Name(id='a', ctx=Store())], value=Constant(value=1)), Assign(targets=[Name(id='b', ctx=Store())], value=Constant(value=2)), Expr(value=Call(func=Name(id='print', ctx=Load()), args=[Compare(left=Name(id='a', ctx=Load()), ops=[Is()], comparators=[Name(id='b', ctx=Load())])], keywords=[]))], type_ignores=[])
Format:
-- ast.Node mypy.Node
metdata
-- <class 'ast.Module'> <class 'mypy.nodes.MypyFile'>
{'fullname': 'ex', 'is_stub': False, 'path': 'ex.py', 'is_partial_stub_package': False, 'is_package_init_file': False, 'names': {'__builtins__': {'.class': 'SymbolTableNode', 'kind': 'Gdef', 'cross_ref': 'builtins', 'type': None}, '__name__': {'.class': 'SymbolTableNode', 'kind': 'Gdef', 'cross_ref': 'ex.__name__', 'type': 'builtins.str'}, '__doc__': {'.class': 'SymbolTableNode', 'kind': 'Gdef', 'cross_ref': 'ex.__doc__', 'type': 'builtins.str'}, '__file__': {'.class': 'SymbolTableNode', 'kind': 'Gdef', 'cross_ref': 'ex.__file__', 'type': 'builtins.str'}, '__package__': {'.class': 'SymbolTableNode', 'kind': 'Gdef', 'cross_ref': 'ex.__package__', 'type': 'builtins.str'}, 'sys': {'.class': 'SymbolTableNode', 'kind': 'Gdef', 'module_hidden': True, 'module_public': False, 'cross_ref': 'sys', 'type': None}, 'a': {'.class': 'SymbolTableNode', 'kind': 'Gdef', 'cross_ref': 'ex.a', 'type': 'builtins.int'}, 'b': {'.class': 'SymbolTableNode', 'kind': 'Gdef', 'cross_ref': 'ex.b', 'type': 'builtins.int'}}, 'imports': [{'is_unreachable': False, 'is_top_level': True, 'is_mypy_only': False, 'assignments': [], 'class_name': 'Import', 'ids': [{'imported': 'sys', 'alias': None}]}]}
-- <class 'ast.Import'> <class 'mypy.nodes.Import'>
{'is_unreachable': False, 'is_top_level': True, 'is_mypy_only': False, 'assignments': [], 'class_name': 'Import', 'ids': [{'imported': 'sys', 'alias': None}]}
-- <class 'ast.Assign'> <class 'mypy.nodes.AssignmentStmt'>
{'type': builtins.int, 'unanalyzed_type': None, 'new_syntax': False, 'is_alias_def': False, 'is_final_def': False}
-- <class 'ast.Name'> <class 'mypy.nodes.NameExpr'>
{'name': 'a', 'fullname': 'ex.a', 'kind': 1, 'is_new_def': True, 'is_special_form': False, 'is_inferred_def': False, 'is_alias_rvalue': False, 'type': builtins.int}
-- <class 'ast.Constant'> <class 'mypy.nodes.IntExpr'>
{'value': 1, 'type': Literal[1]?}
-- <class 'ast.Assign'> <class 'mypy.nodes.AssignmentStmt'>
{'type': builtins.int, 'unanalyzed_type': None, 'new_syntax': False, 'is_alias_def': False, 'is_final_def': False}
-- <class 'ast.Name'> <class 'mypy.nodes.NameExpr'>
{'name': 'b', 'fullname': 'ex.b', 'kind': 1, 'is_new_def': True, 'is_special_form': False, 'is_inferred_def': False, 'is_alias_rvalue': False, 'type': builtins.int}
-- <class 'ast.Constant'> <class 'mypy.nodes.IntExpr'>
{'value': 2, 'type': Literal[2]?}
-- <class 'ast.Expr'> <class 'mypy.nodes.ExpressionStmt'>
{}
-- <class 'ast.Call'> <class 'mypy.nodes.CallExpr'>
{'arg_kinds': [0], 'arg_names': [None], 'is_analyzed': False}
-- <class 'ast.Name'> <class 'mypy.nodes.NameExpr'>
{'name': 'print', 'fullname': 'builtins.print', 'kind': 1, 'is_new_def': False, 'is_special_form': False, 'is_inferred_def': False, 'is_alias_rvalue': False, 'type': def (*values: builtins.object, *, sep: Union[builtins.str, None] =, end: Union[builtins.str, None] =, file: Union[_typeshed.SupportsWrite[builtins.str], None] =, flush: builtins.bool =)}
-- <class 'ast.Compare'> <class 'mypy.nodes.ComparisonExpr'>
{'operators': ['is'], 'method_types': [None, None, None], 'type': builtins.bool}
-- <class 'ast.Name'> <class 'mypy.nodes.NameExpr'>
{'name': 'a', 'fullname': 'ex.a', 'kind': 1, 'is_new_def': False, 'is_special_form': False, 'is_inferred_def': False, 'is_alias_rvalue': False, 'type': builtins.int}
-- <class 'ast.Name'> <class 'mypy.nodes.NameExpr'>
{'name': 'b', 'fullname': 'ex.b', 'kind': 1, 'is_new_def': False, 'is_special_form': False, 'is_inferred_def': False, 'is_alias_rvalue': False, 'type': builtins.int}
I am going to release it soon.
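For readers who want a feel for the "ast node plus metadata" pairing shown in that output, here is a toy sketch. It does not use mypy and is not how `typed-linter` works: the `metadata` attribute and the `ToyAnnotator` class are made up for illustration, and the "type" of a constant is just the Python type of its literal value.

```python
import ast


class ToyAnnotator(ast.NodeVisitor):
    """Attach a `metadata` dict to some nodes (a stand-in for a type checker)."""

    def visit_Constant(self, node):
        # A literal's runtime type is known without any inference.
        node.metadata = {"value": node.value, "type": type(node.value).__name__}

    def visit_Name(self, node):
        # A real tool would record the inferred type and fullname here.
        node.metadata = {"name": node.id, "ctx": type(node.ctx).__name__}


source = "import sys\na = 1\nb = 2\nprint(a is b)\n"
tree = ast.parse(source)
ToyAnnotator().visit(tree)

rows = [
    (type(node).__name__, node.metadata)
    for node in ast.walk(tree)
    if hasattr(node, "metadata")
]
for kind, metadata in rows:
    print("--", kind)
    print(metadata)
```

The hard part that this sketch skips is exactly what the thread is about: getting real inferred types for names and compound expressions requires the checker's own analysis.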
@sobolevn did you release that?
> I have made a tool to enhance `ast` with metadata from `mypy`: [...] I am going to release it soon.
Hi @sobolevn, that sounds good! Is it already available? I'm looking for a way to annotate the AST with mypy information for my project.
> Hm, dmypy (mypy's daemon mode) has much of the same information available, there's just no API for it yet. We do have an experimental API that suggests the signature for an unannotated function based on how it's called (`dmypy suggest`).
> Maybe we should develop something similar to Pyre's API? Maybe we could even just copy the same API style, to make it easier for clients to switch.
I think the most relevant query from `pyre` is the one that lists the types in one or more files. The output is quite simple: just a list containing the annotation for each token.
There are many more queries in pyre, but being able to get a list of types directly from dmypy or mypy would be enough.
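To make the idea concrete, here is a sketch of how a client might consume such a list. The response shape below (a list of files, each with `types` entries carrying a `location` and an `annotation`) is an assumption loosely modeled on that kind of query, not a documented mypy or dmypy format, and `type_at` is a hypothetical helper.

```python
from typing import Optional

# Hypothetical response for ex.py (lines are 1-based, columns 0-based).
RESPONSE = [
    {
        "path": "ex.py",
        "types": [
            {"location": {"start": {"line": 2, "column": 0},
                          "stop": {"line": 2, "column": 1}},
             "annotation": "int"},
            {"location": {"start": {"line": 3, "column": 0},
                          "stop": {"line": 3, "column": 1}},
             "annotation": "int"},
            # `a is b` inside print(a is b), plus its two operands.
            {"location": {"start": {"line": 4, "column": 6},
                          "stop": {"line": 4, "column": 12}},
             "annotation": "bool"},
            {"location": {"start": {"line": 4, "column": 6},
                          "stop": {"line": 4, "column": 7}},
             "annotation": "int"},
            {"location": {"start": {"line": 4, "column": 11},
                          "stop": {"line": 4, "column": 12}},
             "annotation": "int"},
        ],
    }
]


def type_at(response, path, line, column) -> Optional[str]:
    """Return the annotation of the narrowest span covering (line, column)."""
    best = None
    for entry in response:
        if entry["path"] != path:
            continue
        for item in entry["types"]:
            loc = item["location"]
            start = (loc["start"]["line"], loc["start"]["column"])
            stop = (loc["stop"]["line"], loc["stop"]["column"])
            if start <= (line, column) < stop:
                size = (stop[0] - start[0], stop[1] - start[1])
                if best is None or size < best[0]:
                    best = (size, item["annotation"])
    return best[1] if best else None


print(type_at(RESPONSE, "ex.py", 4, 6))  # on `a`: narrowest span wins
print(type_at(RESPONSE, "ex.py", 4, 8))  # on `is`: only the comparison covers it
```

Picking the narrowest covering span is one possible policy; an IDE hover would likely want it, while a refactoring tool might prefer an exact match on the span's start and stop.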
> @sobolevn did you release that?
I'm working on something to address that: https://github.com/pyastrx/pyastrx/pull/44
But this is just a workaround; later on, I'll try to create a better and faster way to do this.
mypyq file1.py file2.py
> I have made a tool to enhance `ast` with metadata from `mypy`: [...] I am going to release it soon.
Hi @sobolevn, have you released it? It would be really useful for one of my projects. I am inclined to use LibCST to get type annotations in my AST; however, setting up a working Pyre environment is cumbersome, so I would prefer a mypy-based tool for that.
I have something that already does this, but I forgot about this issue. I can work on it and send a PR.
> I have made a tool to enhance `ast` with metadata from `mypy`: [...] I am going to release it soon.
Hi @sobolevn, this would be extremely useful for me and probably others. Have you released this? If not, can you release the (partial) source code? Thanks!
@GideonBear no, it is too unreliable. Source is here: https://github.com/wemake-services/typed-linter/tree/master/typed_linter/contrib/mypy
> Source is here: https://github.com/wemake-services/typed-linter/tree/master/typed_linter/contrib/mypy
@sobolevn Is it private? https://github.com/wemake-services/typed-linter is a 404 for me.
Maybe this can help you, @GideonBear: https://github.com/pyastrx/pyastrx/tree/main/pyastrx/inference
You can also run this after installing pyastrx:
mypyq -f test.py
@devmessias what is your recommended approach to get the inferred type information within the AST?
Is there a viable solution now? I saw that you've been busy getting related PRs approved across various projects.
Knowing the type of the expressions in the AST is useful for IDEs (and IDE plugins) to support autocompletion, for static analyzers and for refactoring tools. As a matter of fact, both `jedi` and `pylint` have their own heuristics for type inference.
In my specific use case I would like to use the type information to write a safe refactoring tool. The idea is to be able to say that you want to refactor specific methods. For example one could want to refactor `string.find` into `string.index`, as:
where `expr` is any string expression, like `'string'`, `string_var`, `(var + 'foo')`, `string_var.replace(' ', '-')`, etc.
into
To safely do this refactoring, one needs to be able to query the type of subexpressions, given their location in the source file. Otherwise, given that `find` is a common name present in many different classes, the refactoring would blindly be applied to all of them.
Note: this is the feature request version of issue #4713.
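The problem with a purely syntactic match can be seen in a small sketch. The sample source, the `Inbox` class, and the `find_call_sites` helper are all made up for illustration; the point is only that matching on the method name finds call sites whose receiver is not a string.

```python
import ast

SOURCE = '''\
title = "hello world"
position = title.find("world")

class Inbox:
    def find(self, query):
        return []

matches = Inbox().find("world")
'''


def find_call_sites(source, method_name):
    """Return (line, column, receiver text) for each `<expr>.method_name(...)` call."""
    sites = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == method_name
        ):
            receiver = node.func.value
            sites.append(
                (node.lineno, node.col_offset, ast.get_source_segment(source, receiver))
            )
    return sorted(sites)


sites = find_call_sites(SOURCE, "find")
for line, column, receiver in sites:
    print(f"{line}:{column} receiver={receiver}")
```

Both call sites are reported, but only the first has a `str` receiver. With a way to ask the type checker for the type of the receiver at each reported location, the tool could keep the first site and skip the second.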