astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License

Invalid Python files that Ruff parses #7633

Open qarmin opened 10 months ago

qarmin commented 10 months ago

Ruff 0.0.291, CPython 3.11.4

Tested with

ruff format . --check

and, for the CPython side:

python -m compileall .

https://github.com/qarmin/Automated-Fuzzer/releases/download/test/Cpython.no.parsable.parsable.by.ruff.zip

This makes it difficult to report fuzzer problems, because with Ruff alone I cannot tell whether a given file is valid Python or not.
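As a rough illustration of the CPython side of this comparison, here is a minimal sketch that asks the running interpreter whether it accepts each file, using the built-in `compile()` instead of `python -m compileall`. The helper names `cpython_accepts` and `rejected_files` are hypothetical, and the sketch deliberately does not call Ruff; its output would still have to be compared against the files that `ruff format . --check` accepts.

```python
from pathlib import Path


def cpython_accepts(path: Path) -> bool:
    """Return True if the running interpreter compiles the file without error."""
    # Read bytes rather than text so that the coding cookie and BOM are
    # handled by the compiler itself, as they are for a file on disk.
    source = path.read_bytes()
    try:
        compile(source, str(path), "exec")
    except (SyntaxError, ValueError):
        # IndentationError is a subclass of SyntaxError; ValueError covers
        # sources containing null bytes on some Python versions.
        return False
    return True


def rejected_files(root: Path) -> list[Path]:
    """List every .py file under root that CPython refuses to compile."""
    return sorted(p for p in root.rglob("*.py") if not cpython_accepts(p))
```

Calling `rejected_files(Path("."))` in the directory holding the fuzzer output gives the set of files CPython rejects; any of those that Ruff formats without complaint fall into the category reported here.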

Example files

*f
break
yield #
#coding:ut
from pydanic import BaseModel

class HostedResult(BaseModel):
    id: str
  url: str

Examples of CPython errors

    *Ï
    ^^^
SyntaxError: can't use starred expression here
*** Sorry: IndentationError: unindent does not match any outer indentation level (98_IDX_0_RAND_10049945704112701812.py, line 45)
    grouping() = "Right"
    ^^^^^^^^^^
SyntaxError: cannot assign to function call here. Maybe you meant '==' instead of '='?
***   File "./990_IDX_0_RAND_14720601008565103604.py", line 107
    self.u<= a.rSDP = None
    ^^^^^^^^^^^^^^^
SyntaxError: cannot assign to comparison
SyntaxError: unknown encoding: u
    'id': (str,await  none_type,),  # noqa: E501
               ^^^^^^^^^^^^^^^^
SyntaxError: 'await' outside async function
    from __future__ import absolute_im
    ^
SyntaxError: future feature absolute_im is not defined

dhruvmanila commented 10 months ago

Related https://github.com/astral-sh/ruff/issues/6895

qarmin commented 10 months ago

List of all CPython errors produced by the files in the first post:

SyntaxError: 'ascii' codec can't decode byte 0xc2 in position 16: ordinal not in range(128)
SyntaxError: 'async for' outside async function
SyntaxError: 'async with' outside async function
SyntaxError: 'await expression' can not be used within an annotation
SyntaxError: 'await' outside async function
SyntaxError: 'await' outside function
SyntaxError: 'break' outside loop
SyntaxError: 'comparison' is an illegal expression for augmented assignment
SyntaxError: 'continue' not properly in loop
SyntaxError: 'expression' is an illegal expression for augmented assignment
SyntaxError: 'list' is an illegal expression for augmented assignment
SyntaxError: 'literal' is an illegal expression for augmented assignment
SyntaxError: 'return' outside function
SyntaxError: 'rot13' is not a text encoding; use codecs.decode() to handle arbitrary codecs
SyntaxError: 'tuple' is an illegal expression for augmented assignment
SyntaxError: 'utf7' codec can't decode byte 0xc2 in position 22: unexpected special character
SyntaxError: 'yield from' inside async function
SyntaxError: 'yield' inside generator expression
SyntaxError: 'yield' outside function
SyntaxError: Generator expression must be parenthesized
SyntaxError: asynchronous comprehension outside of an asynchronous function
SyntaxError: can't use starred expression here
SyntaxError: cannot assign to False
SyntaxError: cannot assign to None
SyntaxError: cannot assign to True
SyntaxError: cannot assign to attribute here. Maybe you meant '==' instead of '='?
SyntaxError: cannot assign to await expression
SyntaxError: cannot assign to await expression here. Maybe you meant '==' instead of '='?
SyntaxError: cannot assign to comparison
SyntaxError: cannot assign to expression
SyntaxError: cannot assign to expression here. Maybe you meant '==' instead of '='?
SyntaxError: cannot assign to function call
SyntaxError: cannot assign to function call here. Maybe you meant '==' instead of '='?
SyntaxError: cannot assign to lambda
SyntaxError: cannot assign to literal
SyntaxError: cannot assign to literal here. Maybe you meant '==' instead of '='?
SyntaxError: cannot assign to set display here. Maybe you meant '==' instead of '='?
SyntaxError: cannot assign to subscript here. Maybe you meant '==' instead of '='?
SyntaxError: cannot delete function call
SyntaxError: cannot delete literal
SyntaxError: default 'except:' must be last
SyntaxError: encoding problem: utI-8 with BOM
SyntaxError: encoding problem: utf-int with BOM
SyntaxError: expected 'else' after 'if' expression
SyntaxError: f-string: cannot use starred expression here
SyntaxError: from __future__ imports must occur at the beginning of the file
SyntaxError: future feature Faimportlseannotations is not defined
SyntaxError: illegal target for annotation
SyntaxError: import * only allowed at module level
SyntaxError: invalid character '¦' (U+00A6)
SyntaxError: invalid non-printable character U+0084
SyntaxError: invalid syntax
SyntaxError: invalid syntax. Maybe you meant '==' or ':=' instead of '='?
SyntaxError: iterable unpacking cannot be used in comprehension
SyntaxError: multiple starred expressions in assignment
SyntaxError: name 'LOG' is used prior to global declaration
SyntaxError: no binding for nonlocal 'card_st' found
SyntaxError: nonlocal declaration not allowed at module level
SyntaxError: not a chance
SyntaxError: starred assignment target must be in a list or tuple
SyntaxError: too many statically nested blocks
SyntaxError: unexpected EOF while parsing
SyntaxError: unknown encoding: -t1-8
SyntaxError: unterminated string literal (detected at line 13)
SyntaxError: wildcard makes remaining patterns unreachable

MichaReiser commented 3 months ago

@dhruvmanila this could be an interesting source for obscure parser tests ;)

dhruvmanila commented 3 months ago

Interesting! I think a lot of these messages shouldn't really be handled by the parser. Many of them are soft syntax errors, meaning they are not raised at the parsing step: for example, 'async with' outside async function, nonlocal declaration not allowed at module level, 'break' outside loop, etc. Still, this list would be useful in the next phase, for emitting these errors as diagnostics.
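To make the distinction concrete, here is a small sketch using CPython itself (not Ruff): `ast.parse` stops after the parsing step, while `compile()` on the resulting tree runs the later checks. The function name `rejection_stage` is hypothetical, and the stage at which a given error is raised is CPython behaviour that may differ between versions.

```python
import ast


def rejection_stage(source: str) -> str:
    """Report which CPython stage rejects the source, if any."""
    try:
        tree = ast.parse(source)  # parsing step only
    except SyntaxError:
        return "parser"
    try:
        compile(tree, "<example>", "exec")  # symbol table and code generation
    except SyntaxError:
        return "compiler"
    return "accepted"


for snippet in ("break", "nonlocal x", "f() = 1", "x = 1"):
    print(f"{snippet!r}: {rejection_stage(snippet)}")
```

On recent CPython versions, `break` and `nonlocal x` at module level parse successfully and are only rejected afterwards, whereas `f() = 1` is rejected by the parser itself. Errors of the first kind are the ones described above as not belonging to the parsing step.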