atom / language-python

Python package for Atom
Other
191 stars 147 forks source link

Incorrect docstring handling #55

Open graingert opened 9 years ago

graingert commented 9 years ago

The syntax highlighting for https://github.com/farcepest/MySQLdb1/blob/9b0b59f1bbe0029757b9b15d1c71586eaca1109c/MySQLdb/cursors.py

Seems to be incorrect, it marks most of the file as a string.

erbridge commented 9 years ago

See also e.g. https://github.com/cobrateam/splinter/blob/master/splinter/browser.py#L46. Everything between the """s should be quoted.

alexjurkiewicz commented 9 years ago

Seems to be that the issue relates to the following corner case:

Here are three instances of the bug that result in slightly different syntax:

Break 1:

before()
r'''(\)'''
after()

Break 2:

before()
r'''(\'''
after()

Break 3:

before()
r'''[\'''
after()

The last example shows different highlighting for after() -- I think in the original case it's not that the docstring fails to end, but that the active "highlighting mode" is affected by characters in the string somehow.

Some working examples:

Works 1:

before()
r'(\)'
after()

Works 2:

before()
r'''\''''
after()

Works 3:

before()
r'''\]'''
after()
auscompgeek commented 9 years ago

Looks like the grammar is possibly trying to greedily highlight regex and hence not bailing when the regex is invalid but the string is closed.

This doesn't quite explain what's going on in the MySQLdb1 file linked in the OP though, though that might just be the regex matching gone wrong as well.

auscompgeek commented 9 years ago

@alexjurkiewicz actually, after testing your snippets in a Python REPL, your second snippet is semi-intended behaviour. Python considers r'''(\''' to be an unclosed string, although it seems to be stuck in regex mode after the open parenthesis:

r'''(\'''
'''
hello()

Similar issue here, except we're dealing with regex classes:

r'''[\'''
'''
hello()

This works as expected/intended:

r'''(\'''
)'''
test()

r'''[\'''
]'''
test()
auscompgeek commented 9 years ago

Guess this is just #39 then?

alexjurkiewicz commented 8 years ago

yes, confusing!