PyCQA / redbaron

Bottom-up approach to refactoring in python

Parsing (seemingly) normal file results in error #162

Open dusktreader opened 6 years ago

dusktreader commented 6 years ago

When parsing this exact file:

"""Update MECC title parameters

Revision ID: c7f7c8ec8317
Revises: a804962ee102
Create Date: 2018-01-29 08:42:17.275163

"""

# revision identifiers, used by Alembic.
revision = 'c7f7c8ec8317'
down_revision = '155e6be59a0c'

from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

def upgrade():
    op.execute("update matcher_interfaces set noise_threshold = 0.2 where name = 'fuzzy_match'")
    op.execute("update mecc_targets set batch_size = 40000 where name = 'MATCHED_TO/title/title'") 

def downgrade():
    op.execute("update matcher_interfaces set noise_threshold = 0.9 where name = 'fuzzy_match'")
    op.execute("update mecc_targets set batch_size = 10000 where name = 'MATCHED_TO/title/title'") 

an IndexError is thrown:

In [1]: from redbaron import RedBaron

In [2]:

In [2]: with open('../cem-data/etc/alembic/versions/') as f:

In [3]: RedBaron(c)
IndexError                                Traceback (most recent call last)
<ipython-input-3-8c06eebe9290> in <module>()
----> 1 RedBaron(c)

~/.pyenv/versions/3.6.5/envs/ambix3.6/lib/python3.6/site-packages/redbaron/ in __init__(self, source_code)
     35         if isinstance(source_code, string_instance):
---> 36             self.node_list = base_nodes.NodeList.from_fst(baron.parse(source_code), parent=self, on_attribute="root")
     37             self.middle_separator = nodes.DotNode({"type": "endl", "formatting": [], "value": "\n", "indent": ""})

~/.pyenv/versions/3.6.5/envs/ambix3.6/lib/python3.6/site-packages/baron/ in parse(source_code, print_function)
     48     if print_function is None:
---> 49         tokens = tokenize(source_code, False)
     50         print_function = has_print_function(tokens)
     51         if print_function:

~/.pyenv/versions/3.6.5/envs/ambix3.6/lib/python3.6/site-packages/baron/ in tokenize(pouet, print_function)
     69 def tokenize(pouet, print_function=False):
---> 70     return mark_indentation(inner_group(space_group(_tokenize(group(split(pouet)), print_function))))

~/.pyenv/versions/3.6.5/envs/ambix3.6/lib/python3.6/site-packages/baron/ in mark_indentation(sequence)
     23 def mark_indentation(sequence):
---> 24     return list(mark_indentation_generator(sequence))

~/.pyenv/versions/3.6.5/envs/ambix3.6/lib/python3.6/site-packages/baron/ in mark_indentation_generator(sequence)
     83         # if we were in an indented situation and that the next line has a lower indentation
     84         if indentations and current[0] == "ENDL":
---> 85             the_indentation_level_changed = get_space(current) is None or get_space(current) != indentations[-1]
     86             if the_indentation_level_changed and iterator.show_next()[0] not in ("ENDL", "COMMENT"):
     87                 new_indent = get_space(current) if len(current) == 4 else ""

~/.pyenv/versions/3.6.5/envs/ambix3.6/lib/python3.6/site-packages/baron/ in get_space(node)
     36     maybe not the best behavior but it seems to work for now.
     37     """
---> 38     if len(node) < 3 or len(node[3]) == 0:
     39         return None
     40     return transform_tabs_to_spaces(node[3][0][1])

IndexError: tuple index out of range
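
Reading the last frame: the guard on line 38 only returns early when the token has fewer than three elements, yet it then reads `node[3]`, which needs four. A three-element token therefore slips past the check and raises. The sketch below reproduces just that condition in isolation. The token layout and the body of `transform_tabs_to_spaces` are assumptions made for illustration, not baron's actual data or code, and the `< 4` guard is only a guess at a possible fix.

```python
def transform_tabs_to_spaces(text):
    # Hypothetical stand-in for baron's helper of the same name;
    # the real implementation may differ.
    return text.replace("\t", " " * 8)


def get_space_as_reported(node):
    # Same condition as line 38 in the traceback.
    if len(node) < 3 or len(node[3]) == 0:
        return None
    return transform_tabs_to_spaces(node[3][0][1])


def get_space_guarded(node):
    # Assumed fix: require four elements before touching node[3].
    if len(node) < 4 or len(node[3]) == 0:
        return None
    return transform_tabs_to_spaces(node[3][0][1])


# Hypothetical three-element ENDL token. The exact shape baron produces
# for a line with trailing spaces is an assumption here.
three_element_token = ("ENDL", "\n", [("SPACE", " ")])

try:
    get_space_as_reported(three_element_token)
    reported_outcome = "no error"
except IndexError as exc:
    reported_outcome = str(exc)

guarded_outcome = get_space_guarded(three_element_token)

print(reported_outcome)  # message of the IndexError raised by node[3]
print(guarded_outcome)   # None: the guarded version returns early
```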

dusktreader commented 6 years ago

It seems to be caused by trailing spaces left on lines 19 and 23 of the file. Removing them lets the file parse without error.
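
Until this is fixed upstream, one possible workaround is to strip trailing spaces and tabs from the source before parsing it. This is a minimal sketch: `strip_trailing_whitespace` is a hypothetical helper, not part of redbaron or baron, and it also trims trailing whitespace inside multi-line string literals, which may not be acceptable for every file.

```python
def strip_trailing_whitespace(source_code):
    """Remove spaces and tabs at the end of every line, keeping line endings."""
    cleaned_lines = []
    for line in source_code.splitlines(keepends=True):
        body = line.rstrip("\r\n")   # line content without its newline
        ending = line[len(body):]    # the original newline, if any
        cleaned_lines.append(body.rstrip(" \t") + ending)
    return "".join(cleaned_lines)


# Small sample with the same kind of trailing whitespace as the file above.
sample = 'op.execute("x") \n\ndef downgrade():\n    pass\t\n'
cleaned = strip_trailing_whitespace(sample)
print(repr(cleaned))
```

With the session above, that would mean calling `RedBaron(strip_trailing_whitespace(c))` instead of `RedBaron(c)`.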