Closed DRMacIver closed 1 year ago
We already check that our input is syntactically valid:
...but per #92, the problem is actually libcst
failing to parse valid code; we can and should check for that and give a more specific error message in such cases.
(I've also occasionally thought about adding minimization, via e.g. https://pypi.org/project/pysource-minimize/ or at worst https://github.com/Zac-HD/mwe)
If so something is going wrong with this check, because here's another example I ran into:
from collections import defaultdict
VALUES = defaultdict(
int, (i, i) for i in range(10)
)
This isn't valid Python:
File "/Users/drmaciver/scratch/shedbug.py", line 4
int, (i, i) for i in range(10)
^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized
But the following is the error I get when I call shed --refactor
on it:
Internal error formatting 'shedbug.py': ParserSyntaxError: Syntax Error @ 3:1.
parser error: error at 3:37: expected one of !=, %, &, (, *, **, +, ,, -, ., /, //, <, <<, <=, ==, >, >=, >>, @, [, ^, and, if, in, is, not, or, |
VALUES = defaultdict(int, (i, i) for i in range(10))
^
Please report this to https://github.com/Zac-HD/shed/issues
Right, I did a bit of investigation. The problem scenario is when we have input code which satisfies the following criteria:
The reason why this is firing with shed --refactor
turns out to have nothing to do with any changes shed is making to the code, but is because when running codemods shed forces the use of the native parser: https://github.com/Zac-HD/shed/blob/master/src/shed/_codemods.py#L62
So all that happens is that the code gets to the point of trying to parse the code with libcst's native parser, correctly gets a syntax error, and reports a confusing error message.
Fixed in 23.5.1 / https://github.com/Zac-HD/shed/commit/baf7705712f87b3c8856bb912204d8a64898e590 by... skipping the libcst logic when it (correctly) rejects invalid code. Potentially a bit confusing if you expected refactorings to happen, but all the other options I could think of seemed worse 😅
Very often shed will claim to have a bug when given invalid Python that libcst fails to reject. This is at best confusing to the end user.
libcst would ideally never do this, but that is not the world we live in, so I propose the following fix: When shed would show the "please report this as a bug" error, it should instead first parse the original source code with
ast.parse
and check if it's actually valid Python, and show a different error message if it's not.