Closed: abulka closed this issue 6 years ago
Is the linked pull request what you need?
Looks promising!
I changed it actually, and committed to master, with some tests. Turns out there is no need for an include_extra parameter; it should just always use True. So if you are looking for tokenize.COMMENT, it will now find it, and if you were looking for a regular token, it works the same as before. So the interface is the same, but your use case is fixed.
I'm closing, but let me know if you still have any issues with this.
Thanks - any idea when the new version will be available via pip?
Just published.
Might have found a problem, or maybe it's the way I'm using the library. When I scan for comments on a node, the next comment in the entire source code is found, regardless of how far away it is. I need to find comments only on the line that the node is part of.
Here is the repro of the weird behaviour:
import ast
import asttokens
import tokenize
from textwrap import dedent

src = dedent("""
    def hello():
        x = 5
        there()
    def there():
        return 999 # my silly comment
    hello() # call it
    there()
    """)
class RecursiveVisitor(ast.NodeVisitor):
    """ example recursive visitor """

    def recursive(func):
        """ decorator to make visitor work recursive """
        def wrapper(self, node):
            self.dump_line_and_comment(node)
            func(self, node)
            for child in ast.iter_child_nodes(node):
                self.visit(child)
        return wrapper

    def dump_line_and_comment(self, node):
        comment = atok.find_token(node.first_token, tokenize.COMMENT)
        print(f'On line "{node.first_token.line.strip():20s}" find_token found "{comment}"')

    @recursive
    def visit_Assign(self, node):
        """ visit a Assign node and visits it recursively"""

    @recursive
    def visit_BinOp(self, node):
        """ visit a BinOp node and visits it recursively"""

    @recursive
    def visit_Call(self, node):
        """ visit a Call node and visits it recursively"""

    @recursive
    def visit_Lambda(self, node):
        """ visit a Function node """

    @recursive
    def visit_FunctionDef(self, node):
        """ visit a Function node and visits it recursively"""

atok = asttokens.ASTTokens(src, parse=True)
tree = atok.tree
visitor = RecursiveVisitor()
visitor.visit(tree)
Gives me:
On line "def hello(): " find_token found "COMMENT:'# my silly comment'"
On line "x = 5 " find_token found "COMMENT:'# my silly comment'"
On line "there() " find_token found "COMMENT:'# my silly comment'"
On line "def there(): " find_token found "COMMENT:'# my silly comment'"
On line "hello() # call it " find_token found "COMMENT:'# call it'"
On line "there() " find_token found "ENDMARKER:''"
That's not a problem with this module; it's just not a feature of it: find_token finds the next matching token regardless of the line. But line breaks themselves introduce tokens, so you can write a helper to find the next comment on the same line as a given token, like so:
def find_line_comment(atok, start_token):
    t = start_token
    while t.type not in (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER):
        t = atok.next_token(t, include_extra=True)
    return t if t.type == tokenize.COMMENT else None
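To see why stopping at NL/NEWLINE is safe, here is a stdlib-only illustration (plain tokenize, no asttokens involved) showing that a comment and the line break after it are separate tokens in the stream:

```python
import io
import tokenize

src = "x = 1  # note\ny = 2\n"

# Tokenize the source; comments and line breaks each get their own token.
toks = list(tokenize.generate_tokens(io.StringIO(src).readline))
types = [tokenize.tok_name[t.type] for t in toks]
print(types)
# The COMMENT on line 1 appears before that line's NEWLINE token, so a
# forward scan that stops at NEWLINE/NL can never leak onto the next line.
```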
Thanks - that helper routine works great. A slight tweak I made is to return either the comment string or an empty string:
def find_line_comment(start_token):
    t = start_token
    while t.type not in (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER):
        t = self.atok.next_token(t, include_extra=True)
    return t.string if t.type == tokenize.COMMENT else ''

comment = find_line_comment(node.first_token)
P.S. My old hack approach was not very 'token' based and for the curious, was simply:
line = node.first_token.line
comment_i = line.find('#')
comment = line[comment_i:].strip() if comment_i != -1 else ''
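One caveat on the string-scan hack, for anyone copying it (my own note, not from the thread): str.find('#') also matches a # inside a string literal, which the token-based helper avoids. A quick stdlib illustration:

```python
import io
import tokenize

line = 'x = "a # b"  # real comment'

# Naive string scan: finds the '#' inside the string literal first.
naive = line[line.find('#'):].strip()

# Token scan: only genuine COMMENT tokens match.
toks = tokenize.generate_tokens(io.StringIO(line + "\n").readline)
real = next(t.string for t in toks if t.type == tokenize.COMMENT)

print(naive)  # '# b"  # real comment'
print(real)   # '# real comment'
```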
I'm trying to find the comment related to an AST node of the Python source code I am analysing. I tried

comment = atok.find_token(node.first_token, tokenize.COMMENT)

where node is an AST node, e.g. an AST Name node for 'x'. The find never works. Looking at the source code of asttokens, I think it is because when find_token() calls next_token() to iterate through the tokens, it never passes True through to the include_extra parameter of next_token(). Any chance of adding an include_extra parameter to find_token() and passing that through to next_token()? You seem to have that parameter everywhere else!