PyCQA / baron

IDEs allow you to refactor code; Baron allows you to write refactoring code.
http://baron.pycqa.org
GNU Lesser General Public License v3.0

Unindented comment parsing error #122

Open Ahuge opened 7 years ago

Ahuge commented 7 years ago

Hi, I have found an issue, which is reproduced below:

import redbaron

code = """
def foo(bar):
    if True:
# I cause a Failure
        print("Foo %s!" % bar)
"""

# Raises ParsingError because of the unindented comment
red = redbaron.RedBaron(code)

The error was:

ParsingError: Error, got an unexpected token $end here:

   1 
   2 def foo(bar):
   3     if True:
   4 # I cause a Failure
   5         print("Foo %s!" % bar)
   6 <---- here

The token $end should be one of those: ASSERT, AT, BACKQUOTE, BINARY, BINARY_RAW_STRING, BINARY_STRING, BREAK, CLASS, COMMENT, COMPLEX, CONTINUE, DEDENT, DEF, DEL, ENDL, ENDMARKER, EXEC, FLOAT, FLOAT_EXPONANT, FLOAT_EXPONANT_COMPLEX, FOR, FROM, GLOBAL, HEXA, IF, IMPORT, INT, LAMBDA, LEFT_BRACKET, LEFT_PARENTHESIS, LEFT_SQUARE_BRACKET, LONG, MINUS, NAME, NOT, OCTA, PASS, PLUS, PRINT, RAISE, RAW_STRING, RETURN, STRING, TILDE, TRY, UNICODE_RAW_STRING, UNICODE_STRING, WHILE, WITH, YIELD

I am guessing, but this sounds similar to #11, in that it comes from the Python parser not caring about comments.

The parsing error is triggered by the dedented single-line comment. This is valid Python, presumably because comments are dropped before indentation is checked: the Python tokenizer ignores comment-only lines when it tracks indentation.
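To back up the claim that the snippet is valid Python, here is a small check using only the standard library (`ast` and `tokenize`), not Baron. It is just a sketch to show how CPython treats the comment: the comment appears as its own token, but it does not add or remove an indentation level.

```python
import ast
import io
import tokenize

code = '''
def foo(bar):
    if True:
# I cause a Failure
        print("Foo %s!" % bar)
'''

# The standard parser accepts the snippet without complaint.
tree = ast.parse(code)

# Tokenize the snippet and record the token type names in order.
tokens = list(tokenize.generate_tokens(io.StringIO(code).readline))
names = [tokenize.tok_name[tok.type] for tok in tokens]

# The comment is reported as a COMMENT token, while INDENT/DEDENT tokens
# come only from the real block openings ("def" and "if").
print(names.count("COMMENT"), names.count("INDENT"), names.count("DEDENT"))
```

If Baron's indentation marker followed the same rule, the comment at column 0 would not close the `if` block.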

Thanks

Ahuge commented 7 years ago

Pretty sure I have a fix in indentation_marker.py.

The following is a snippet that I will turn into a PR soon.

@@ -61,20 +61,26 @@ def mark_indentation_generator(sequence):
                 indentations.pop()
 
         # if were are at ":\n" like in "if stuff:\n"
-        if current[0] == "COLON" and iterator.show_next(1)[0] == "ENDL":
+        # Comments can be at whatever indentation they feel like.
+        if current[0] in ("COLON", "COMMENT") and iterator.show_next(1)[0] == "ENDL":
             # if we aren't in "if stuff:\n\n"
             if iterator.show_next(2)[0] not in ("ENDL",):
-                indentations.append(get_space(iterator.show_next()))
+                space = get_space(iterator.show_next())
+                if space is not None:
+                    indentations.append(space)
                 yield current
                 yield next(iterator)
-                yield ('INDENT', '')
+                if space is not None:
+                    yield ('INDENT', '')
                 continue
             else:  # else, skip all "\n"
                 yield current
                 for i in iterator:
                     if i[0] == 'ENDL' and iterator.show_next()[0] not in ('ENDL',):
-                        indentations.append(get_space(i))
-                        yield ('INDENT', '')
+                        space = get_space(i)
+                        if space is not None:
+                            indentations.append(get_space(i))
+                            yield ('INDENT', '')
                         yield i
                         break
                     yield i

I am doing two things: allowing a single-line comment to be followed by an indentation after its ENDL, and making sure that the indentation is not None before it is recorded.
I would like to dig into the second change some more to make sure it has no unintended consequences.
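
To illustrate the idea outside of Baron, here is a simplified, line-based sketch. It is not the code from indentation_marker.py: `mark_indentation` and its `("LINE", ...)` events are hypothetical names made up for this example, and it assumes space-only indentation. The point is only that blank and comment-only lines pass through without touching the indentation stack.

```python
def mark_indentation(lines):
    """Yield ("INDENT", ""), ("DEDENT", "") and ("LINE", text) events.

    Hypothetical helper for illustration only; Baron works on a token
    stream, not on raw lines.
    """
    stack = [0]  # indentation widths of the currently open blocks
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            # Blank or comment-only line: emit it, leave the stack alone.
            yield ("LINE", line)
            continue
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            yield ("INDENT", "")
        while width < stack[-1]:
            # Closes blocks; inconsistent dedents are not validated here.
            stack.pop()
            yield ("DEDENT", "")
        yield ("LINE", line)
    # Close any blocks still open at the end of the input.
    while len(stack) > 1:
        stack.pop()
        yield ("DEDENT", "")


sample = 'def foo(bar):\n    if True:\n# I cause a Failure\n        print("Foo %s!" % bar)\n'
kinds = [kind for kind, _ in mark_indentation(sample.splitlines())]
print(kinds)
```

With this rule the comment at column 0 does not dedent out of the `if` block, so the `print` line still opens one nested block and both blocks are closed at the end of the input.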

PR should come soonish.
Thanks for reading!

Ahuge commented 7 years ago

I am having an issue with test_indentation_marker.test_comment_in_middle_of_ifelseblock.

Will have an updated PR soon.