fyrestone closed this 6 years ago
It looks like the standard ast module will handle an encoding declaration (like # -*- coding: UTF-8 -*-) if the content is bytes, but will reject an encoding declaration if the content is unicode. Here's an example:
>>> import ast
>>> ast.parse("# -*- coding: UTF-8 -*-\nprint 'foo'")
<_ast.Module object at 0x101231c50>
>>> ast.parse(u"# -*- coding: UTF-8 -*-\nprint 'foo'")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 37, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 0
SyntaxError: encoding declaration in Unicode string
So either pass in your source code as bytes, or if you pass in unicode, don't include encoding declarations in the source.
Try calling asttokens.ASTTokens(s.encode('utf8'), parse=True). If that's not enough, please include a reproducible example, and I'll try to help.
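For illustration, here is a minimal sketch of the "pass bytes" option using only the standard ast module. It is written for Python 3, whereas the traceback above comes from Python 2.7; as far as I know, Python 3 ignores the declaration in text input, so treat this as a portable habit rather than a required fix there. The helper name parse_as_bytes and its encoding parameter are made up for this example.

```python
import ast


def parse_as_bytes(source, encoding="utf-8"):
    """Parse Python source, encoding text to bytes first.

    Hypothetical helper: handing the parser bytes lets it read any
    encoding declaration in the source itself.
    """
    if isinstance(source, str):
        # Assumption: `encoding` matches the declaration in the source.
        source = source.encode(encoding)
    return ast.parse(source)


tree = parse_as_bytes("# -*- coding: UTF-8 -*-\nprint('foo')\n")
print(type(tree).__name__)
```

The encoding used here has to agree with the one the source declares; otherwise the parser will decode the bytes with the wrong codec.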
Thank you very much. I found that it was caused by chardet's auto-detection: some files were converted to UTF-8 using the wrong codec.
Glad it's resolved.
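For anyone who hits the same mis-detection, one alternative to guessing the codec is to let the standard library read the file's own declaration. The sketch below assumes Python 3 and uses tokenize.detect_encoding, which looks for a BOM or an encoding declaration in the first two lines and otherwise falls back to UTF-8. The helper name decode_python_source is hypothetical, and this is not necessarily what the reporter ended up doing.

```python
import ast
import io
import tokenize


def decode_python_source(raw):
    """Decode raw source bytes using the file's own declaration or BOM.

    Hypothetical helper: returns the decoded text and the encoding name
    reported by tokenize.detect_encoding.
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw).readline)
    return raw.decode(encoding), encoding


# Sample input: source bytes written in Latin-1 with a matching declaration.
raw = "# -*- coding: latin-1 -*-\nname = 'café'\n".encode("latin-1")

text, encoding = decode_python_source(raw)
print(encoding)

# The undecoded bytes can still go straight to the parser.
tree = ast.parse(raw)
```

Keeping the original bytes around means they can be passed to the parser unchanged, while the decoded text is available for anything that needs a string.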