gristlabs / asttokens

Annotate Python AST trees with source text and token information
Apache License 2.0

xonsh shell tree #62

Closed by anki-code 4 years ago

anki-code commented 4 years ago

Hello! Thank you for asttokens!

I'm wondering whether I can use asttokens with the xonsh shell syntax tree, since xonsh is a superset of Python.

For example, I replaced the tree with the one produced by xonsh, and it works with pure Python:

# To run this code just do `pip install xonsh`, run `xonsh` and copy-paste this code
import ast, asttokens
st='''
def greet(a):
  return b
'''
atok = asttokens.ASTTokens(
                  source_text=st, 
                  parse=False,
                  tree=__xonsh__.execer.parse(st, ctx=__xonsh__.ctx) # ast.Expression
                 )

for node in ast.walk(atok.tree):
  if hasattr(node, 'lineno'):
    print(atok.get_text_range(node), node.__class__.__name__, atok.get_text(node))

Result:

(1, 25) FunctionDef def greet(a):
  return b
(17, 25) Return return b
(11, 12) arg a
(24, 25) Name b

The token positions are exactly what I need.

But when I set:

st="echo 1" # subprocess command

I get a traceback:

# Run `$XONSH_SHOW_TRACEBACK = True` in xonsh to show traceback
Traceback (most recent call last):
  File "/opt/miniconda/lib/python3.8/site-packages/xonsh/base_shell.py", line 362, in default
    run_compiled_code(code, self.ctx, None, "single")
  File "/opt/miniconda/lib/python3.8/site-packages/xonsh/codecache.py", line 67, in run_compiled_code
    func(code, glb, loc)
  File "<xonsh-code>", line 3, in <module>
  File "/opt/miniconda/lib/python3.8/site-packages/asttokens/asttokens.py", line 65, in __init__
    self.mark_tokens(self._tree)
  File "/opt/miniconda/lib/python3.8/site-packages/asttokens/asttokens.py", line 76, in mark_tokens
    MarkTokens(self).visit_tree(root_node)
  File "/opt/miniconda/lib/python3.8/site-packages/asttokens/mark_tokens.py", line 49, in visit_tree
    util.visit_tree(node, self._visit_before_children, self._visit_after_children)
  File "/opt/miniconda/lib/python3.8/site-packages/asttokens/util.py", line 199, in visit_tree
    ret = postvisit(current, par_value, value)
  File "/opt/miniconda/lib/python3.8/site-packages/asttokens/mark_tokens.py", line 92, in _visit_after_children
    nfirst, nlast = self._methods.get(self, node.__class__)(node, first, last)
  File "/opt/miniconda/lib/python3.8/site-packages/asttokens/mark_tokens.py", line 189, in handle_attr
    name = self._code.next_token(dot)
  File "/opt/miniconda/lib/python3.8/site-packages/asttokens/asttokens.py", line 141, in next_token
    while is_non_coding_token(self._tokens[i].type):
IndexError: list index out of range

Could you please help, or advise whether I can get token positions using asttokens with the xonsh shell AST parser? If not, could you point me to another way?

Thanks!

dsagal commented 4 years ago

The xonsh parser produces its own AST. Looking at the source code, it seems to use enough Python that its AST is likely an extension of Python's AST, with extra tokens and nodes. You'd need to fork asttokens and make it aware of these additions for it to be able to handle a xonsh AST. It's hard to say how much work that would be. It might be easy for someone familiar with xonsh internals, but hard for everyone else :)

anki-code commented 4 years ago

Got it! Thanks!

alexmojaki commented 4 years ago

Just to add to what @dsagal said, asttokens has to specially handle many kinds of nodes whose tokens are not correctly inferred by the initial generic algorithm. For example in the traceback it says it was trying to specially handle an attribute node and failed because it couldn't find a dot. Does xonsh have a different concept of Attribute in the AST? Either way, you would probably need to add special cases for some (probably not all) of the additional nodes found in xonsh.
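The mismatch described above can be illustrated with the standard library alone. This is a minimal sketch, not asttokens' actual code: `dot_tokens` and `attribute_nodes` are hypothetical helpers, and `standin_tree` is only an assumed stand-in for the kind of tree xonsh generates for a subprocess command. It shows a tree that contains an Attribute node while the source text it is paired with contains no `.` token for that node to be matched against.

```python
import ast
import io
import tokenize


def dot_tokens(source):
    """Return every '.' operator token Python's tokenizer finds in source."""
    readline = io.StringIO(source).readline
    return [
        tok
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.OP and tok.string == "."
    ]


def attribute_nodes(tree):
    """Return every ast.Attribute node in the tree."""
    return [node for node in ast.walk(tree) if isinstance(node, ast.Attribute)]


# The text the user typed in the shell.
shell_source = "echo 1\n"

# Assumption: a hand-written stand-in for the generated tree, parsed from
# ordinary Python so the example runs without xonsh installed.
standin_tree = ast.parse(
    "__xonsh__.subproc_captured_hiddenobject(['echo', '1'])", mode="eval"
)

n_attributes = len(attribute_nodes(standin_tree))
n_dots = len(dot_tokens(shell_source))
print(n_attributes, n_dots)
```

Any token-matching step that expects a dot after the value of each Attribute node has nothing to find in `shell_source`, which is consistent with the failure in `handle_attr` shown in the traceback.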

scopatz commented 4 years ago

FWIW xonsh produces a plain-old Python AST with no additional nodes.

anki-code commented 4 years ago

I tried to walk the tree:

```python
# To run this code just do `pip install xonsh`, run `xonsh` and copy-paste this code
import ast

def str_node(node):
    if isinstance(node, ast.AST):
        fields = [(name, str_node(val)) for name, val in ast.iter_fields(node) if name not in ('left', 'right')]
        rv = '%s(%s' % (node.__class__.__name__, ', '.join('%s=%s' % field for field in fields))
        return rv + ')'
    else:
        return repr(node)

def ast_visit(node, level=0):
    print(' ' * level + str_node(node))
    for field, value in ast.iter_fields(node):
        if isinstance(value, list):
            for item in value:
                if isinstance(item, ast.AST):
                    ast_visit(item, level=level+1)
        elif isinstance(value, ast.AST):
            ast_visit(value, level=level+1)

cmd = 'echo @("hello") | head'
ast_visit(__xonsh__.execer.parse(cmd, ctx=__xonsh__.ctx))
```

And got:

```
Expression(body=Call(func=Attribute(value=Name(id='__xonsh__', ctx=Load()), attr='subproc_captured_hiddenobject', ctx=Load()), args=[<_ast.BinOp object at 0x7f374f478460>, <_ast.Constant object at 0x7f374f584850>, <_ast.List object at 0x7f374f478610>], keywords=[]))
 Call(func=Attribute(value=Name(id='__xonsh__', ctx=Load()), attr='subproc_captured_hiddenobject', ctx=Load()), args=[<_ast.BinOp object at 0x7f374f478460>, <_ast.Constant object at 0x7f374f584850>, <_ast.List object at 0x7f374f478610>], keywords=[])
  Attribute(value=Name(id='__xonsh__', ctx=Load()), attr='subproc_captured_hiddenobject', ctx=Load())
   Name(id='__xonsh__', ctx=Load())
    Load()
   Load()
  BinOp(op=Add())
   List(elts=[<_ast.Call object at 0x7f374f9f2a90>], ctx=Load())
    Call(func=Attribute(value=Name(id='__xonsh__', ctx=Load()), attr='expand_path', ctx=Load()), args=[<_ast.Constant object at 0x7f374f9f23a0>], keywords=[])
     Attribute(value=Name(id='__xonsh__', ctx=Load()), attr='expand_path', ctx=Load())
      Name(id='__xonsh__', ctx=Load())
       Load()
      Load()
     Constant(value='echo')
    Load()
   Add()
   Call(func=Attribute(value=Name(id='__xonsh__', ctx=Load()), attr='list_of_strs_or_callables', ctx=Load()), args=[<_ast.Constant object at 0x7f374f478d60>], keywords=[])
    Attribute(value=Name(id='__xonsh__', ctx=Load()), attr='list_of_strs_or_callables', ctx=Load())
     Name(id='__xonsh__', ctx=Load())
      Load()
     Load()
    Constant(value='hello')
  Constant(value='|')
  List(elts=[<_ast.Call object at 0x7f374f478c70>], ctx=Load())
   Call(func=Attribute(value=Name(id='__xonsh__', ctx=Load()), attr='expand_path', ctx=Load()), args=[<_ast.Constant object at 0x7f374f478b20>], keywords=[])
    Attribute(value=Name(id='__xonsh__', ctx=Load()), attr='expand_path', ctx=Load())
     Name(id='__xonsh__', ctx=Load())
      Load()
     Load()
    Constant(value='head')
   Load()
```

Here I don't see any fields with source position information. Does that mean the data isn't there, or am I looking in the wrong place?

anki-code commented 4 years ago

Oh, I see, we can just avoid trying to specially handle an attribute node. I'm working on a PR.

alexmojaki commented 4 years ago

> Here I don't see any fields with source position information. Does that mean the data isn't there, or am I looking in the wrong place?

It's not present in ast.iter_fields(node), but the attributes are there. You can find them in the __dict__. Or you can use ast.dump(node, include_attributes=True, indent=2), but you will need Python 3.9 for indent=2.
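As a small sketch of that difference, using a tree parsed by the standard `ast` module rather than by xonsh (so the positions here are the ones CPython assigns; nodes synthesized by another tool may carry different values or none at all, which is why the example reads them with a default):

```python
import ast

tree = ast.parse("obj.attr")
node = tree.body[0].value  # the ast.Attribute node for `obj.attr`

# iter_fields() only yields the names listed in node._fields.
fields = dict(ast.iter_fields(node))

# Position data lives in the names listed in node._attributes instead.
positions = {name: getattr(node, name, None) for name in node._attributes}

print(sorted(fields))
print(positions)

# ast.dump() shows the positions only when asked to.
dumped = ast.dump(node, include_attributes=True)
print(dumped)
```

Applying the same `getattr` loop inside a tree walker would print the positions next to each node without needing `indent=2` or Python 3.9.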

anki-code commented 4 years ago

Thank you for your help and fast responses! We discussed these questions in the PR, and I now have enough information to keep thinking about it.