RustPython / Parser

MIT License
67 stars 26 forks source link

Generator expression without parenthesis does not raise SyntaxError #84

Open key262yek opened 2 years ago

key262yek commented 2 years ago

Feature

In CPython, generator expression without parenthesis raises following error

>>> foo(x, z for z in range(10), t, w)
  File "<stdin>", line 1
    foo(x, z for z in range(10), t, w)
           ^^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized

However, In RustPython, it does not matter

>>>>> foo(x, z for z in range(10), t, w)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'foo' is not defined

Python Documentation

youknowone commented 2 years ago

This is allowed only for single-argument function call.

In python.lalrpop I see FunctionArgument is including Generator stuff. I guess that part is expected to be moved to ArgumentList part.

key262yek commented 2 years ago

I try to change parse_args function in ArgumentList

ArgumentList: ArgumentList = {
    <e: Comma<FunctionArgument>> =>? {
        let arg_list = parse_args(e)?;
        Ok(arg_list)
    }
};

First, I checked whether the generator expression is single-argument or not.

// In parser/src/functions.rs
pub fn parse_args(func_args: Vec<FunctionArgument>) -> Result<ArgumentList, LexicalError> {
    let mut args = vec![];
    let mut keywords = vec![];
// <<<< New
    let len = func_args.len();    // read length of FunctionArgument
// <<<<
    let mut keyword_names = HashSet::with_capacity_and_hasher(func_args.len(), RandomState::new());
    for (name, value) in func_args {
        match name {
            Some((location, name)) => {
                ..
            }
            None => {
                ..
// <<<< New
                if let ast::ExprKind::GeneratorExp { elt: e, .. } = &value.node {
                    if len != 1{
                        return Err(LexicalError {
                            error: LexicalErrorType::OtherError(
                                "Generator expression must be parenthsized".to_string(),
                            ),
                            location: e.location,
                        });
                    }
                }
// <<<<
                args.push(value);
            }
        }
    }
    Ok(ArgumentList { args, keywords })
}

But, in this way, we cannot figure out that the generator at the first time were parenthesized or not. Following statements all raise errors.

>>>>> foo(x, z for z in range(10), t, w)
SyntaxError: Generator expression must be parenthsized at line 1 column 8
foo(x, z for z in range(10), t, w)
       ^
>>>>> foo(x, (z for z in range(10)), t, w)
SyntaxError: Generator expression must be parenthsized at line 1 column 8
foo(x, (z for z in range(10)), t, w)
        ^

It's because, as we can see in the parsing algorithm of function argument, NamedExpressionTest can be a generator expression itself.

FunctionArgument: (Option<(ast::Location, Option<String>)>, ast::Expo) = {
    <e:NamedExpressionTest> <c:CompFor?> => {    // <<<< Here
        let mut parenthesized = true;
        let expr = match c {
            Some(c) => ast::Expr {
                    location: e.location,
                    custom: (),
                    node: ast::ExprKind::GeneratorExp {
                        elt: Box::new(e),
                        generators: c,
                    }
            },
            None => e,
        };
        (None, expr)
    },
    ..
}

// In NamedExpressionTest parsing
  <location:@L> "(" <elt:Test> <generators:CompFor> ")" => {
        ast::Expr {
            location,
            custom: (),
            node: ast::ExprKind::GeneratorExp { elt: Box::new(elt), generators }
        }
    },

Hence, I add boolean attribute to result of FunctionArgument parsing, which contains whether the generator expression were parenthesized or not.

FunctionArgument: (Option<(ast::Location, Option<String>)>, ast::Expr, bool) = { // <<< Here
    <e:NamedExpressionTest> <c:CompFor?> => {
        let mut parenthesized = true;        // <<< New
        let expr = match c {
            Some(c) => {
                parenthesized = false;          // <<< New
                ast::Expr {
                    location: e.location,
                    custom: (),
                    node: ast::ExprKind::GeneratorExp {
                        elt: Box::new(e),
                        generators: c,
                    }
                }
            },
            None => e,
        };
        (None, expr, parenthesized)          // <<<< New
    },
    ..
}

and change the parse_args function as follows.

// <<<< New
type FunctionArgument = (Option<(ast::Location, Option<String>)>, ast::Expr, bool);
// <<<<

pub fn parse_args(func_args: Vec<FunctionArgument>) -> Result<ArgumentList, LexicalError> {
    let mut args = vec![];
    let mut keywords = vec![];
// <<<< New
    let len = func_args.len();
// <<<<
    let mut keyword_names = HashSet::with_capacity_and_hasher(func_args.len(), RandomState::new());
    for (name, value, parenthsized) in func_args {
        match name {
            Some((location, name)) => {
                ..
            }
            None => {
                ..

// <<<< New
                if let ast::ExprKind::GeneratorExp { elt: e, .. } = &value.node {
                    if len != 1 && !parenthsized {
                        return Err(LexicalError {
                            error: LexicalErrorType::OtherError(
                                "Generator expression must be parenthsized".to_string(),
                            ),
                            location: e.location,
                        });
                    }
                }
// <<<<
                args.push(value);
            }
        }
    }
    Ok(ArgumentList { args, keywords })
}

Question

Question is that, does it seem okay to add new attribute in lalrpop? Is there any other way to resolve it without adding new structure? any idea?

youknowone commented 2 years ago

More above than ArgumentList.

cpython:

>>> class A(x for x in range(10)): pass
  File "<stdin>", line 1
    class A(x for x in range(10)): pass
           ^
SyntaxError: expected ':'

RustPython:

>>>>> class A(x for x in range(10)): pass
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

This is SyntaxError but RustPython is not catching it.

In CPython, this is handled by t_primary: https://github.com/python/cpython/blob/3.10/Grammar/python.gram#L823-L824

which not used for class definition. (class_def is using arguments https://github.com/python/cpython/blob/3.10/Grammar/python.gram#L489-L495)

My suggestion is creating one more layer like cpython.