PyCQA / redbaron

Bottom-up approach to refactoring in python
http://redbaron.pycqa.org/

Incorrect parsing of Unicode Literals. Got - UntreatedError: Untreated elements #208

Open Erotemic opened 2 years ago

Erotemic commented 2 years ago

The following code illustrates an issue in redbaron 0.9.2, where it fails to parse source code containing Unicode (non-ASCII) identifiers:

    import textwrap
    import redbaron

    # This works properly
    text = textwrap.dedent(
        '''
        p1, p2 = (1, 2)
        ''').strip('\n')
    red = redbaron.RedBaron(text)

    # But this fails when we use unicode symbols for identifiers
    text = textwrap.dedent(
        '''
        ρ1, ρ2 = (1, 2)
        ''').strip('\n')
    red = redbaron.RedBaron(text)

    # Still fails with a single unicode identifier
    text = textwrap.dedent(
        '''
        ρ2 = 2
        ''').strip('\n')
    red = redbaron.RedBaron(text)

    # Still fails with a different unicode identifier, even when the parsed
    # source explicitly imports unicode_literals from __future__
    text = textwrap.dedent(
        '''
        from __future__ import unicode_literals
        θ = 2
        ''').strip('\n')
    red = redbaron.RedBaron(text)

Using a Unicode character in a variable name is valid in Python 3 (PEP 3131), but redbaron 0.9.2 fails to parse such code and raises UntreatedError: Untreated elements.
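As a sanity check that these snippets are valid Python 3 on their own, here is a minimal sketch that parses them with the standard library's ast module instead of redbaron. The helper name assigned_names is hypothetical and only for illustration; it says nothing about how redbaron tokenizes its input.

```python
import ast

# Snippets taken from the report above; each uses a non-ASCII identifier.
snippets = [
    "ρ1, ρ2 = (1, 2)",
    "ρ2 = 2",
    "from __future__ import unicode_literals\nθ = 2",
]


def assigned_names(source):
    """Parse with the stdlib parser and return the names being assigned to.

    Hypothetical helper for illustration; not part of redbaron.
    """
    tree = ast.parse(source)
    return sorted(
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    )


for source in snippets:
    names = assigned_names(source)
    # str.isidentifier() applies the Python 3 identifier rules.
    assert all(name.isidentifier() for name in names)
    print(names)
```

If this runs without raising SyntaxError, the interpreter itself accepts the identifiers, which suggests the failure is specific to redbaron's parsing rather than to the input.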

    # System information
    import sys
    print('sys.version_info = {!r}'.format(sys.version_info))
    import ubelt as ub
    _ = ub.cmd('pip list | grep redbaron', shell=True, verbose=1)

Results in:

    sys.version_info = sys.version_info(major=3, minor=8, micro=6, releaselevel='final', serial=0)
    redbaron                          0.9.2
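For anyone reproducing this without ubelt or a shell, the same system information can probably be gathered with the standard library alone. This is a sketch assuming Python 3.8 or later, where importlib.metadata is available; installed_version is a hypothetical helper name.

```python
import sys
from importlib import metadata  # in the standard library since Python 3.8


def installed_version(package):
    """Return the installed version string of a package, or None if absent.

    Hypothetical helper for illustration.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print('sys.version_info = {!r}'.format(sys.version_info))
print('redbaron = {!r}'.format(installed_version('redbaron')))
```

Unlike piping pip list through grep, this reports the version visible to the running interpreter, which avoids confusion when several environments are installed.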