pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

BUG: `DataFrame.eval` fails with TypeError with multiline expr but works when `eval` line by line #59062

Open messense opened 1 week ago

messense commented 1 week ago

Pandas version checks

Reproducible Example

import pandas as pd

df = pd.DataFrame({"first": [9.76, 9.76, 9.76], "last": [9.76, 9.76, 9.76], "pre": [9.75, 9.76, 9.76]})

expr = """first_ret = first / pre.fillna(first) - 1.0
last_ret = last / pre.fillna(first) - 1.0"""

# this fails with `TypeError: "value" parameter must be a scalar, dict or Series, but you passed a "ndarray"`
df.eval(expr)

# this works fine
for line in expr.splitlines():
    df.eval(line)

Issue Description

`DataFrame.eval` raises `TypeError: "value" parameter must be a scalar, dict or Series, but you passed a "ndarray"` when given a multiline expression, but evaluating the same expression one line at a time works.

The culprit seems to be that the code at https://github.com/pandas-dev/pandas/blob/a60ad39b4a9febdea9a59d602dad44b1538b0ea5/pandas/core/computation/align.py#L140 modifies `Scope.resolvers`.
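A small sketch of why a mutated resolver would produce this exact message. It does not exercise `align.py` or `Scope.resolvers`. It only shows that `Series.fillna` accepts a Series as the fill value but rejects an ndarray, which is consistent with the guess that `first` is no longer a Series by the time the second line is evaluated. The sample values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data, not taken from the report above.
pre = pd.Series([9.75, np.nan, 9.76])
first = pd.Series([9.76, 9.76, 9.76])

# A Series fill value is accepted and aligned on the index.
filled = pre.fillna(first)

# An ndarray fill value is rejected by Series.fillna.
try:
    pre.fillna(first.to_numpy())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If the hypothesis is right, the second statement of the multiline expression would hit the second code path because the name it looks up now refers to an array rather than the original column.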

Expected Behavior

`df.eval(expr)` should also work, giving the same result as evaluating the expression line by line.
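Until the multiline form works, one possible workaround is to chain the assignments one line at a time. The sketch below assumes the single-line behavior described above, and compares the chained result against the same columns computed with ordinary pandas operations. The names `expected` and `result` are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"first": [9.76, 9.76, 9.76], "last": [9.76, 9.76, 9.76], "pre": [9.75, 9.76, 9.76]})

expr = """first_ret = first / pre.fillna(first) - 1.0
last_ret = last / pre.fillna(first) - 1.0"""

# Reference result computed without eval.
denominator = df["pre"].fillna(df["first"])
expected = df.assign(
    first_ret=df["first"] / denominator - 1.0,
    last_ret=df["last"] / denominator - 1.0,
)

# Workaround: evaluate each assignment separately, feeding each
# returned frame into the next call so both new columns accumulate.
result = df.copy()
for line in expr.splitlines():
    result = result.eval(line)

pd.testing.assert_frame_equal(result, expected)
```

Each `eval` call returns a new DataFrame here (the default `inplace=False`), so the original `df` is left unchanged.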

Installed Versions

INSTALLED VERSIONS
------------------
commit : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python : 3.10.12.final.0
python-bits : 64
OS : Linux
OS-release : 6.5.0-1020-aws
Version : #20~22.04.1-Ubuntu SMP Wed May 1 16:10:50 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.2
numpy : 1.24.4
pytz : 2023.3
dateutil : 2.8.2
setuptools : 68.1.2
pip : 24.0
Cython : None
pytest : 8.1.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.9.9
jinja2 : 3.1.2
IPython : 8.14.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : 1.3.7
dataframe-api-compat : None
fastparquet : None
fsspec : 2023.6.0
gcsfs : None
matplotlib : 3.7.2
numba : 0.57.1
numexpr : 2.10.1
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : 16.0.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : 2023.6.0
scipy : 1.12.0
sqlalchemy : 2.0.20
tables : None
tabulate : 0.9.0
xarray : 2023.8.0
xlrd : 2.0.1
zstandard : 0.22.0
tzdata : 2023.3
qtpy : None
pyqt5 : None