Closed by gipert 3 weeks ago
Is it an error message saying that it's not supported?
If Pandas is just running Python eval with column names loaded into the namespace (as the documentation suggests), then I don't see why those strings couldn't operate directly on Awkward Arrays. Maybe the AwkwardSeries objects need to be unwrapped before passing to eval, and the result needs to be re-wrapped? (This is a question for @douglasdavis.)
> I don't see why those strings couldn't operate directly on Awkward Arrays.
Indeed that was my thinking. This is what happens:
>>> import awkward_pandas as akpd
>>> import pandas as pd
>>> import awkward as ak
>>> df = pd.DataFrame(
...     {
...         "a": [1, 2, 3, 4],
...         "b": akpd.from_awkward(ak.Array([[1, 2], [], [3], [4, 5, 6]])),
...     }
... )
>>> df.eval("b*2")
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[10], line 1
----> 1 df.eval("b*2")
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/frame.py:4566, in DataFrame.eval(self, expr, inplace, **kwargs)
4563 kwargs["target"] = self
4564 kwargs["resolvers"] = tuple(kwargs.get("resolvers", ())) + resolvers
-> 4566 return _eval(expr, inplace=inplace, **kwargs)
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/eval.py:336, in eval(expr, parser, engine, local_dict, global_dict, resolvers, level, target, inplace)
327 # get our (possibly passed-in) scope
328 env = ensure_scope(
329 level + 1,
330 global_dict=global_dict,
(...)
333 target=target,
334 )
--> 336 parsed_expr = Expr(expr, engine=engine, parser=parser, env=env)
338 if engine == "numexpr" and (
339 is_extension_array_dtype(parsed_expr.terms.return_type)
340 or getattr(parsed_expr.terms, "operand_types", None) is not None
(...)
344 )
345 ):
346 warnings.warn(
347 "Engine has switched to 'python' because numexpr does not support "
348 "extension array dtypes. Please set your engine to python manually.",
349 RuntimeWarning,
350 stacklevel=find_stack_level(),
351 )
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/expr.py:809, in Expr.__init__(self, expr, engine, parser, env, level)
807 self.parser = parser
808 self._visitor = PARSERS[parser](self.env, self.engine, self.parser)
--> 809 self.terms = self.parse()
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/expr.py:828, in Expr.parse(self)
824 def parse(self):
825 """
826 Parse an expression.
827 """
--> 828 return self._visitor.visit(self.expr)
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/expr.py:415, in BaseExprVisitor.visit(self, node, **kwargs)
413 method = f"visit_{type(node).__name__}"
414 visitor = getattr(self, method)
--> 415 return visitor(node, **kwargs)
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/expr.py:421, in BaseExprVisitor.visit_Module(self, node, **kwargs)
419 raise SyntaxError("only a single expression is allowed")
420 expr = node.body[0]
--> 421 return self.visit(expr, **kwargs)
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/expr.py:415, in BaseExprVisitor.visit(self, node, **kwargs)
413 method = f"visit_{type(node).__name__}"
414 visitor = getattr(self, method)
--> 415 return visitor(node, **kwargs)
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/expr.py:424, in BaseExprVisitor.visit_Expr(self, node, **kwargs)
423 def visit_Expr(self, node, **kwargs):
--> 424 return self.visit(node.value, **kwargs)
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/expr.py:415, in BaseExprVisitor.visit(self, node, **kwargs)
413 method = f"visit_{type(node).__name__}"
414 visitor = getattr(self, method)
--> 415 return visitor(node, **kwargs)
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/expr.py:537, in BaseExprVisitor.visit_BinOp(self, node, **kwargs)
535 op, op_class, left, right = self._maybe_transform_eq_ne(node)
536 left, right = self._maybe_downcast_constants(left, right)
--> 537 return self._maybe_evaluate_binop(op, op_class, left, right)
File ~/.virtualenvs/legend/lib/python3.11/site-packages/pandas/core/computation/expr.py:507, in BaseExprVisitor._maybe_evaluate_binop(self, op, op_class, lhs, rhs, eval_in_python, maybe_eval_in_python)
504 res = op(lhs, rhs)
506 if res.has_invalid_return_type:
--> 507 raise TypeError(
508 f"unsupported operand type(s) for {res.op}: "
509 f"'{lhs.type}' and '{rhs.type}'"
510 )
512 if self.engine != "pytables" and (
513 res.op in CMP_OPS_SYMS
514 and getattr(lhs, "is_datetime", False)
(...)
517 # all date ops must be done in python bc numexpr doesn't work
518 # well with NaT
519 return self._maybe_eval(res, self.binary_ops)
TypeError: unsupported operand type(s) for *: 'awkward' and '<class 'int'>'
Just for context: I'm writing some code to evaluate algebraic expressions from a config file on tables made of jagged and rectangular columns. I was hoping to be able to write almost no code by using pandas.eval().
...
I've just run a quick test and it seems to me that pandas.DataFrame.eval() is not supported. Is that correct? Is there a way to evaluate string expressions on dataframes containing Awkward arrays?