googleapis / python-bigquery-sqlalchemy

SQLAlchemy dialect for BigQuery
MIT License
436 stars 130 forks source link

in_ Operator does not work for column type NUMERIC #1126

Open pforero opened 1 month ago

pforero commented 1 month ago

The command IN UNNEST does not work for columns with type NUMERIC the same way it works with INT64 or FLOAT64.

The in_ operator on python-bigquery-sqlalchemy uses the IN UNNEST(@PARAM1) syntax. Therefore it is unable to run an in operation for columns with the NUMERIC type.

Note: IN (@PARAM_1) works well with NUMERIC. And when compiling with compile_kwargs "literal_binds"=True it uses that syntax. But for normal compile it uses IN UNNEST.

Environment details

Steps to reproduce

  1. Create a sample BigQuery table with different types. Go to the the consoles, editor and run the code example.
  2. Run in the console different versions on the test query to see what works and doesn't work.
  3. Populate the table with some values.
  4. Run an SQLAlchemy query with the in_ operator on the numeric column and it will raise an error.

Code example

--Create table for step 1
CREATE TABLE my_dataset.test_table(
  numeric_col NUMERIC,
  int_col INT64,
  float_col FLOAT64
)
--Tests for syntax that works and doesn't work with numeric
SELECT
  COUNT(*)
FROM my_dataset.test_table`
WHERE
  numeric_col IN UNNEST([1]) -- Doesn't work
  int_col IN UNNEST([1]) -- Works
  numeric_col IN (1) -- Works
  numeric_col IN UNNEST([1.1]) -- Doesn't work
  float_col IN UNNEST([1.1]) -- Works
  numeric_col IN (1.1) -- Works
--Populate the table with some values
INSERT my_dataset.test_table (numeric_col , int_col , float_col)
VALUES(1, 1, 1.1), (2.2, 2, 2.2)
# Set-up the ORM of the table
from sqlalchemy import create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column

class Base(DeclarativeBase):
    pass
class TestTable(Base):
    __tablename__ = "test_table"
    __table_args__ = {"schema": "my_dataset"}
    numeric_col: Mapped[float] = mapped_column(primary_key=True)
    int_col: Mapped[int]
    float_col: Mapped[float]

bq = create_engine("bigquery://")
conn = bq.connect()
session = Session(conn)

stmt_numeric = select(TestTable).where(TestTable.numeric_col.in_([1]))
stmt_int = select(TestTable).where(TestTable.int_col.in_([1]))
stmt_float = select(TestTable).where(TestTable.float_col.in_([1.1]))

session.execute(stmt_int) # Works
session.execute(stmt_float) # Works
session.execute(stmt_numeric) # Doesn't work

Seeing as the syntax

WHERE numeric_col IN (1)

actually works it is counter intuitive seeing it doesn't work with sqlalchame-bigquery.

Thanks!