ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0

bug: StructFilter::ToExpression not yet supported #9667

Closed ncclementi closed 4 months ago

ncclementi commented 4 months ago

What happened?

When trying to run the following code (xref: https://github.com/ibis-project/ibis/issues/9662)

>>> # assumed setup, not shown in the original report
>>> import ibis
>>> from ibis import _
>>> con = ibis.duckdb.connect()
>>> url = "s3://overturemaps-us-west-2/release/2024-06-13-beta.1/theme=buildings/type=*/*"
>>> t = con.read_parquet(url, table_name="buildings",
                 **{"filename": True,
                   "hive_partitioning": 1})
>>> expr = t.select(
                    _.id,
                    _.height,
                    _.geometry,
                    _.bbox,
                    primary_names=_.names.primary).filter(
                    _.primary_names.notnull(),
                    _.bbox.xmin > ibis.literal(-84.36, "float32"),
                    _.bbox.xmax < ibis.literal(-82.42, "float32"),
                    _.bbox.ymin > ibis.literal(41.71, "float32"),
                    _.bbox.ymax < ibis.literal(43.33, "float32")).limit(10)
>>> expr.execute()

I get

---------------------------------------------------------------------------
NotImplementedException                   Traceback (most recent call last)
Cell In[9], line 1
----> 1 expr.execute()

File ~/Documents/git/my_forks/ibis/ibis/expr/types/core.py:393, in Expr.execute(self, limit, params, **kwargs)
    375 def execute(
    376     self,
    377     limit: int | str | None = "default",
    378     params: Mapping[ir.Value, Any] | None = None,
    379     **kwargs: Any,
    380 ):
    381     """Execute an expression against its backend if one exists.
    382 
    383     Parameters
   (...)
    391         Keyword arguments
    392     """
--> 393     return self._find_backend(use_default=True).execute(
    394         self, limit=limit, params=params, **kwargs
    395     )

File ~/Documents/git/my_forks/ibis/ibis/backends/duckdb/__init__.py:1384, in Backend.execute(self, expr, params, limit, **_)
   1381 import pandas as pd
   1382 import pyarrow.types as pat
-> 1384 table = self._to_duckdb_relation(expr, params=params, limit=limit).arrow()
   1386 df = pd.DataFrame(
   1387     {
   1388         name: (
   (...)
   1400     }
   1401 )
   1402 df = DuckDBPandasData.convert_table(df, expr.as_table().schema())

NotImplementedException: Not implemented Error: StructFilter::ToExpression not yet supported

I thought this was because I had `_.bbox`, which is a struct column, in the select. But when I tried to invert the order and do the filter first and then the select to avoid that, like:

>>> expr2 = t.filter(
                _.names.primary.notnull(),
                _.bbox.xmin > ibis.literal(-84.36, "float32"),
                _.bbox.xmax < ibis.literal(-82.42, "float32"),
                _.bbox.ymin > ibis.literal(41.71, "float32"),
                _.bbox.ymax < ibis.literal(43.33, "float32"),
                ).select(_.id, _.height, _.geometry, 
                primary_names=_.names.primary).limit(5)
>>> expr2.execute()           

I get the same error.

What version of ibis are you using?

main

What backend(s) are you using, if any?

duckdb

Relevant log output

No response

cpcloud commented 4 months ago

What version of duckdb are you running? I don't see this with 1.0.0. If you're using any custom-built extensions, that could also be the issue.

This looks reproducible without Ibis, given that the error message is coming from duckdb and it's saying it doesn't support some piece of functionality.

Can you grab the SQL produced by Ibis and create an upstream bug report for them?

ncclementi commented 4 months ago

I upgraded to the latest nightly and can reproduce with just duckdb. Sorry for the noise, closing. xref upstream: https://github.com/duckdb/duckdb/issues/13120

cpcloud commented 4 months ago

No worries, thanks for reporting upstream!