ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0

bug: SignatureValidationError from DerefMap during repeated joins #9561

Closed coroa closed 1 month ago

coroa commented 1 month ago

What happened?

import ibis
import numpy as np

t1 = ibis.memtable(
    np.array([[1, 3, 4, 5, 0], [1, 4, 5, 7, 1], [0, 3, 1, 5, 2], [0, 4, 2, 7, 3]]),
    columns=["i", "j", "lower", "upper", "label"],
)
t1v = t1.view()
t2 = t1.inner_join(t1v, ["i", "j"]).select("i", "j", coeff=1, var1=t1.label, var2=t1v.label)

t3 = ibis.memtable(np.array([[0, 2], [1, 3], [2, 5]]), columns=["i", "value"])
t2.inner_join(t3, ["i"])

is raising the following SignatureValidationError:

---------------------------------------------------------------------------
SignatureValidationError                  Traceback (most recent call last)
Cell In[85], line 9
      6 t2 = t1.inner_join(t1v, ["i", "j"]).select("i", "j", coeff=1, var1=t1.label, var2=t1v.label)
      8 t3 = ibis.memtable(np.array([[0, 2], [1, 3], [2, 5]]), columns=["i", "value"])
----> 9 t2.inner_join(t3, ["i"])

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/expr/types/relations.py:89, in _regular_join_method.<locals>.f(self, right, predicates, lname, rname)
     58 def f(  # noqa: D417
     59     self: ir.Table,
     60     right: ir.Table,
   (...)
     67     rname: str = "{name}_right",
     68 ) -> ir.Table:
     69     """Perform a join between two tables.
     70 
     71     Parameters
   (...)
     87         Joined table
     88     """
---> 89     return self.join(right, predicates, how=how, lname=lname, rname=rname)

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/expr/types/relations.py:3247, in Table.join(left, right, predicates, how, lname, rname)
   3091 """Perform a join between two tables.
   3092 
   3093 Parameters
   (...)
   3243 └─────────┴───────────────────┴───────────────┴───────────────────┘
   3244 """
   3245 from ibis.expr.types.joins import Join
-> 3247 return Join(left.op()).join(
   3248     right, predicates, how=how, lname=lname, rname=rname
   3249 )

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/expr/types/joins.py:254, in Join.join(self, right, predicates, how, lname, rname)
    252 # bind and dereference the predicates
    253 preds = prepare_predicates(chain, right, predicates)
--> 254 preds = flatten_predicates(preds)
    255 if not preds and how != "cross":
    256     # if there are no predicates, default to every row matching unless
    257     # the join is a cross join, because a cross join already has this
    258     # behavior
    259     preds.append(ops.Literal(True, dtype="bool"))

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/expr/rewrites.py:191, in flatten_predicates(node)
    187     else:
    188         # halt and yield the node
    189         return False, node
--> 191 return list(traverse(predicate, node))

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/common/graph.py:630, in traverse(fn, node)
    616 def traverse(
    617     fn: Callable[[Node], tuple[bool | Iterable, Any]], node: Iterable[Node] | Node
    618 ) -> Iterator[Any]:
    619     """Utility for generic expression tree traversal.
    620 
    621     Parameters
   (...)
    628 
    629     """
--> 630     nodes = list(_flatten_collections(promote_list(node)))
    631     queue: deque[Node] = deque(reversed(nodes))
    632     seen: set[Node] = set()

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/util.py:110, in promote_list(val)
    108     return [val]
    109 elif is_iterable(val):
--> 110     return list(val)
    111 elif val is None:
    112     return []

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/expr/types/joins.py:162, in prepare_predicates(chain, right, predicates, comparison)
    160 reverse = {ops.Field(chain, k): v for k, v in chain.values.items()}
    161 deref_right = DerefMap.from_targets(right)
--> 162 deref_left = DerefMap.from_targets(chain.tables, extra=reverse)
    163 deref_both = DerefMap.from_targets([*chain.tables, right], extra=reverse)
    165 left, right = chain.to_expr(), right.to_expr()

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/expr/rewrites.py:102, in DerefMap.from_targets(cls, rels, extra)
     99 if extra is not None:
    100     subs.update(extra)
--> 102 return cls(rels, subs, ambigs)

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/common/bases.py:72, in AbstractMeta.__call__(cls, *args, **kwargs)
     52 def __call__(cls, *args, **kwargs):
     53     """Create a new instance of the class.
     54 
     55     The subclass may override the `__create__` classmethod to change the
   (...)
     70 
     71     """
---> 72     return cls.__create__(*args, **kwargs)

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/common/grounds.py:119, in Annotable.__create__(cls, *args, **kwargs)
    116 @classmethod
    117 def __create__(cls, *args: Any, **kwargs: Any) -> Self:
    118     # construct the instance by passing only validated keyword arguments
--> 119     kwargs = cls.__signature__.validate(cls, args, kwargs)
    120     return super().__create__(**kwargs)

File ~/.local/conda/envs/linopy/lib/python3.12/site-packages/ibis/common/annotations.py:497, in Signature.validate(self, func, args, kwargs)
    494         this[name] = result
    496 if errors:
--> 497     raise SignatureValidationError(
    498         "{call} has failed due to the following errors:{errors}\n\nExpected signature: {sig}",
    499         sig=self,
    500         func=func,
    501         args=args,
    502         kwargs=kwargs,
    503         errors=errors,
    504     )
    506 return this

SignatureValidationError: DerefMap([<ibis.expr.operations.relations.JoinReference object at 0x158ab5f70>, <ibis.expr.operations.relations.SelfReference object at 0x157f0b6b0>], {<ibis.expr.operations.relations.Field object at 0x157f2d240>: <ibis.expr.operations.relations.Field object at 0x157f2d240>, <ibis.expr.operations.relations.Field object at 0x1586f81a0>: <ibis.expr.operations.relations.Field object at 0x157f2d240>, <ibis.expr.operations.relations.Field object at 0x157f2f1c0>: <ibis.expr.operations.relations.Field object at 0x157f2f1c0>, <ibis.expr.operations.relations.Field object at 0x157f2d860>: <ibis.expr.operations.relations.Field object at 0x157f2f1c0>, <ibis.expr.operations.relations.Field object at 0x157f2de10>: <ibis.expr.operations.relations.Field object at 0x157f2de10>, <ibis.expr.operations.relations.Field object at 0x157f2e190>: <ibis.expr.operations.relations.Field object at 0x157f2de10>, <ibis.expr.operations.relations.Field object at 0x157f2e350>: <ibis.expr.operations.relations.Field object at 0x157f2e350>, <ibis.expr.operations.relations.Field object at 0x157f2cbb0>: <ibis.expr.operations.relations.Field object at 0x157f2e350>, <ibis.expr.operations.relations.Field object at 0x157f2ce50>: <ibis.expr.operations.relations.Field object at 0x157f2ce50>, <ibis.expr.operations.relations.Field object at 0x157f2c0c0>: <ibis.expr.operations.relations.Field object at 0x157f2ce50>, <ibis.expr.operations.relations.Field object at 0x157f2c2f0>: <ibis.expr.operations.relations.Field object at 0x157f2c2f0>, <ibis.expr.operations.relations.Field object at 0x157f2e270>: <ibis.expr.operations.relations.Field object at 0x157f2e270>, <ibis.expr.operations.relations.Field object at 0x157f2dbe0>: <ibis.expr.operations.relations.Field object at 0x157f2dbe0>, <ibis.expr.operations.relations.Field object at 0x157f2c4b0>: <ibis.expr.operations.relations.Field object at 0x157f2c4b0>, <ibis.expr.operations.relations.Field object at 0x157f2c360>: 
<ibis.expr.operations.relations.Field object at 0x157f2c360>, <ibis.expr.operations.relations.Field object at 0x157f2cb40>: <ibis.expr.operations.relations.Field object at 0x157f2e890>, <ibis.expr.operations.relations.Field object at 0x157f2c130>: <ibis.expr.operations.relations.Field object at 0x157f2ea50>, <ibis.expr.operations.relations.Field object at 0x157f2d160>: <ibis.expr.operations.generic.Literal object at 0x157eeaa50>, <ibis.expr.operations.relations.Field object at 0x157f2fa10>: <ibis.expr.operations.relations.Field object at 0x157f2e740>, <ibis.expr.operations.relations.Field object at 0x157f2f8c0>: <ibis.expr.operations.relations.Field object at 0x157f2c8a0>}, {}) has failed due to the following errors:
  `subs`: {<ibis.expr.operations.relations.Field object at 0x157f2d240>: <ibis.expr.operations.relations.Field object at 0x157f2d240>, <ibis.expr.operations.relations.Field object at 0x1586f81a0>: <ibis.expr.operations.relations.Field object at 0x157f2d240>, <ibis.expr.operations.relations.Field object at 0x157f2f1c0>: <ibis.expr.operations.relations.Field object at 0x157f2f1c0>, <ibis.expr.operations.relations.Field object at 0x157f2d860>: <ibis.expr.operations.relations.Field object at 0x157f2f1c0>, <ibis.expr.operations.relations.Field object at 0x157f2de10>: <ibis.expr.operations.relations.Field object at 0x157f2de10>, <ibis.expr.operations.relations.Field object at 0x157f2e190>: <ibis.expr.operations.relations.Field object at 0x157f2de10>, <ibis.expr.operations.relations.Field object at 0x157f2e350>: <ibis.expr.operations.relations.Field object at 0x157f2e350>, <ibis.expr.operations.relations.Field object at 0x157f2cbb0>: <ibis.expr.operations.relations.Field object at 0x157f2e350>, <ibis.expr.operations.relations.Field object at 0x157f2ce50>: <ibis.expr.operations.relations.Field object at 0x157f2ce50>, <ibis.expr.operations.relations.Field object at 0x157f2c0c0>: <ibis.expr.operations.relations.Field object at 0x157f2ce50>, <ibis.expr.operations.relations.Field object at 0x157f2c2f0>: <ibis.expr.operations.relations.Field object at 0x157f2c2f0>, <ibis.expr.operations.relations.Field object at 0x157f2e270>: <ibis.expr.operations.relations.Field object at 0x157f2e270>, <ibis.expr.operations.relations.Field object at 0x157f2dbe0>: <ibis.expr.operations.relations.Field object at 0x157f2dbe0>, <ibis.expr.operations.relations.Field object at 0x157f2c4b0>: <ibis.expr.operations.relations.Field object at 0x157f2c4b0>, <ibis.expr.operations.relations.Field object at 0x157f2c360>: <ibis.expr.operations.relations.Field object at 0x157f2c360>, <ibis.expr.operations.relations.Field object at 0x157f2cb40>: <ibis.expr.operations.relations.Field object at 0x157f2e890>, 
<ibis.expr.operations.relations.Field object at 0x157f2c130>: <ibis.expr.operations.relations.Field object at 0x157f2ea50>, <ibis.expr.operations.relations.Field object at 0x157f2d160>: <ibis.expr.operations.generic.Literal object at 0x157eeaa50>, <ibis.expr.operations.relations.Field object at 0x157f2fa10>: <ibis.expr.operations.relations.Field object at 0x157f2e740>, <ibis.expr.operations.relations.Field object at 0x157f2f8c0>: <ibis.expr.operations.relations.Field object at 0x157f2c8a0>} is not matching GenericMappingOf(key=CoercedTo(type=<class 'ibis.expr.operations.core.Value'>, func=<bound method Value.__coerce__ of <class 'ibis.expr.operations.core.Value'>>), value=CoercedTo(type=<class 'ibis.expr.operations.relations.Field'>, func=<bound method Value.__coerce__ of <class 'ibis.expr.operations.relations.Field'>>), type=CoercedTo(type=<class 'ibis.common.collections.FrozenDict'>, func=<class 'ibis.common.collections.FrozenDict'>))

Expected signature: DerefMap(rels: tuple[Relation, ...], subs: FrozenDict[Value, Field], ambigs: FrozenDict[Value, tuple[Value, ...]])
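
One detail that can be read off the message above: the printed `subs` mapping contains an entry whose value is a `Literal` object, while the expected signature declares `subs: FrozenDict[Value, Field]`. Whether that entry originates from `coeff=1` in the select is an assumption, not something the traceback confirms. The following is a minimal plain-Python sketch of that kind of value-type check; the classes `Value`, `Field`, `Literal` and the helper `invalid_subs` are hypothetical stand-ins and not the real ibis classes or validation code.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; these are NOT the ibis classes.
@dataclass(frozen=True)
class Value:
    name: str


@dataclass(frozen=True)
class Field(Value):
    pass


@dataclass(frozen=True)
class Literal(Value):
    pass


def invalid_subs(subs):
    """Return the entries of `subs` whose value is not a Field."""
    return {k: v for k, v in subs.items() if not isinstance(v, Field)}


subs = {
    Field("i"): Field("i"),        # Field -> Field satisfies the declared type
    Field("coeff"): Literal("1"),  # Field -> Literal does not (assumed origin: coeff=1)
}

bad = invalid_subs(subs)
print(bad)
```

In this sketch only the `coeff` entry is reported, which mirrors the single `Field: Literal` pair visible in the error message.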

Background

This arises in a prototype system that builds optimization models from algebraic equations.

t1 represents a vector variable with two dimensions, i and j, lower and upper bounds, and a variable identifier (which ends up as the column index in the constraint matrix). Let's call this variable x for the time being.

The join for t2 is then equivalent to the quadratic expression that is built by the algebra: x * x. The expression is represented by a table with var1 and var2 columns, which hold the label identifiers, and a coeff column, which holds the coefficient of the quadratic term.

The join with t3 would then be the multiplication by a pandas Series with only the index i (so it should broadcast across the j's of the expression in t2), which is ultimately meant to scale the coeff values, but ibis seems to get confused when constructing this join (even though it should be relatively straightforward).

I am not sure how relevant that is to the bug report though.
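
For reference, the result the two joins are meant to produce can be sketched in plain Python, without ibis. The row data are copied from the reproducer above; the list-of-dicts representation and the names `value_by_i` and `scaled` are illustrative assumptions and not part of the prototype.

```python
# Plain-Python sketch of the intended join semantics; no ibis involved.
t1 = [
    {"i": 1, "j": 3, "lower": 4, "upper": 5, "label": 0},
    {"i": 1, "j": 4, "lower": 5, "upper": 7, "label": 1},
    {"i": 0, "j": 3, "lower": 1, "upper": 5, "label": 2},
    {"i": 0, "j": 4, "lower": 2, "upper": 7, "label": 3},
]
t3 = [{"i": 0, "value": 2}, {"i": 1, "value": 3}, {"i": 2, "value": 5}]

# x * x: inner self-join on (i, j), one quadratic term per matching pair.
t2 = [
    {"i": a["i"], "j": a["j"], "coeff": 1, "var1": a["label"], "var2": b["label"]}
    for a in t1
    for b in t1
    if (a["i"], a["j"]) == (b["i"], b["j"])
]

# Inner join on i only: each value is broadcast across every j of that i.
value_by_i = {row["i"]: row["value"] for row in t3}
scaled = [
    {**row, "coeff": row["coeff"] * value_by_i[row["i"]]}
    for row in t2
    if row["i"] in value_by_i
]

for row in scaled:
    print(row)
```

The row for i = 2 in t3 has no partner in t2, so an inner join drops it; the two rows with i = 1 get coeff 3 and the two rows with i = 0 get coeff 2.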

What version of ibis are you using?

9.1.0

What backend(s) are you using, if any?

DuckDB

Relevant log output

No response

ncclementi commented 1 month ago

@coroa Can you give us a bit more context on what you are trying to achieve that triggers this?

coroa commented 1 month ago

Hi @ncclementi ,

Phew, not really. This comes up relatively deep down in building an engine for writing linear optimization problems from mathematical equations with an xarray-inspired syntax. To do automatic per-dimension broadcasting I am relying on a lot of sequential joins, and this is the minimal failing example that I could distill after a lot of debugging.

Unfortunately, I don't understand the idea behind the DerefMaps, so I am a bit unsure what is happening. I fear the self-join of the t1 table is at the heart of the problem (and it will be difficult to avoid in my higher-level code), but I guess I could work around it if I must.

(Added some more background above)