As noted here: https://stackoverflow.com/questions/48371747/how-to-modify-a-liner-regression-in-python-3-6
This formula causes patsy to raise an error:
The problem is that in `patsy.parse_formula._read_python_expr`, patsy tries to figure out whether an arbitrary Python expression is a numeric literal, and the way it does this is by calling `int(...)` and `float(...)` on the expression and seeing if they work. In this case, `float("inf")` does work, so patsy decides that the Python expression `inf` is a numeric literal. Whoops.

The same thing probably happens if you try to use `nan` as a variable name in a formula.

I guess a more reliable way of checking for numeric literals would be to check the tokenize output: if an expression is a single token, and that token has type `tokenize.NUMBER`, then it's a numeric literal.
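To make the difference concrete, here is a rough sketch of both checks using only the standard library. The function names `looks_numeric_by_conversion` and `is_numeric_literal` are made up for illustration and are not patsy's actual code; the first only approximates the conversion-based behaviour described above, and the second is one possible reading of the tokenize idea.

```python
import io
import tokenize

# Token types that carry no content; ignored when counting tokens.
_SKIP = {
    tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
    tokenize.COMMENT, tokenize.INDENT, tokenize.DEDENT,
}


def looks_numeric_by_conversion(expr):
    """Approximation of the conversion-based check: try int(), then float()."""
    for convert in (int, float):
        try:
            convert(expr)
            return True
        except ValueError:
            pass
    return False


def is_numeric_literal(expr):
    """Hypothetical helper: True only if expr is exactly one NUMBER token."""
    try:
        tokens = [
            tok
            for tok in tokenize.generate_tokens(io.StringIO(expr).readline)
            if tok.type not in _SKIP
        ]
    except (tokenize.TokenError, SyntaxError):
        # Anything the tokenizer rejects is certainly not a numeric literal.
        return False
    return len(tokens) == 1 and tokens[0].type == tokenize.NUMBER


for name in ("inf", "nan", "1.5", "1e3"):
    print(name, looks_numeric_by_conversion(name), is_numeric_literal(name))
```

With this sketch, `inf` and `nan` pass the conversion-based check but tokenize as `NAME`, so the tokenize-based check treats them as ordinary variable names. One thing to keep in mind is that a signed value such as `-1` is two tokens (an operator and a number), so it would not count as a single literal under this rule.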