Open mcrumiller opened 6 months ago
Which driver are you using to connect? Can I see the alchemy URI prefix? (Could be odbc or native, and they may have different behaviour)
I'm using pymssql for the MSSQL connection (uri mssql+pymssql://server/database).
However, I just checked on postgres (postgresql://user:pass@server:5432/database) and the same thing occurs:
# retrieves proper dtype
print(pl.read_database_uri(sql, uri))
# shape: (1, 5)
# ┌─────────┬───────────┬──────────┬─────────────┬──────────┐
# │ int_col ┆ float_col ┆ char_col ┆ varchar_col ┆ date_col │
# │ --- ┆ --- ┆ --- ┆ --- ┆ --- │
# │ i32 ┆ f64 ┆ str ┆ str ┆ date │
# ╞═════════╪═══════════╪══════════╪═════════════╪══════════╡
# │ null ┆ null ┆ null ┆ null ┆ null │
# └─────────┴───────────┴──────────┴─────────────┴──────────┘
# returns null dtype
print(pl.read_database(sql, conn))
# shape: (1, 5)
# ┌─────────┬───────────┬──────────┬─────────────┬──────────┐
# │ int_col ┆ float_col ┆ char_col ┆ varchar_col ┆ date_col │
# │ --- ┆ --- ┆ --- ┆ --- ┆ --- │
# │ null ┆ null ┆ null ┆ null ┆ null │
# ╞═════════╪═══════════╪══════════╪═════════════╪══════════╡
# │ null ┆ null ┆ null ┆ null ┆ null │
# └─────────┴───────────┴──────────┴─────────────┴──────────┘
Ok, I know what this is: I haven't got around to doing direct driver-module introspection to back out custom cursor description type codes and further infer Polars dtypes yet. At the moment only standard Python types or type strings get inferred from the cursor result object; custom driver-specific type codes aren't handled yet, so if you have no data (or no typed data, e.g. all null) we can't determine the column dtype.
It's something of a nightmare (_"welcome to the wild, wild world of the DBAPI2 type_code attribute"_), but I've done it before in my "real job" so I know where the bodies are buried ;)
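To illustrate the kind of type_code reverse-lookup described above, here is a minimal sketch. It is not the Polars implementation. It assumes a PostgreSQL driver that reports type OIDs as type_code (as psycopg2 does, to my understanding). The table PG_OID_TO_DTYPE and the function infer_dtypes are hypothetical names, and the dtypes are given as plain strings so the example needs only the standard library.

```python
from typing import Dict, Optional, Sequence

# Hypothetical lookup table (an assumption for illustration): PostgreSQL type
# OIDs mapped to the *names* of Polars dtypes.
PG_OID_TO_DTYPE = {
    23: "Int32",     # int4
    701: "Float64",  # float8
    1042: "String",  # bpchar, i.e. char(n)
    1043: "String",  # varchar
    1082: "Date",    # date
}


def infer_dtypes(description: Sequence[tuple]) -> Dict[str, Optional[str]]:
    """Map each column of a DBAPI2 cursor.description to a dtype name.

    Columns whose type_code is not in the lookup table map to None.
    """
    # Each description entry is a 7-item sequence; item 0 is the column
    # name and item 1 is the driver-specific type_code.
    return {col[0]: PG_OID_TO_DTYPE.get(col[1]) for col in description}


# Hand-written description resembling what such a driver might report for
# the five-column example table; no database connection is involved.
description = [
    ("int_col", 23, None, None, None, None, None),
    ("float_col", 701, None, None, None, None, None),
    ("char_col", 1042, None, None, None, None, None),
    ("varchar_col", 1043, None, None, None, None, None),
    ("date_col", 1082, None, None, None, None, None),
]

print(infer_dtypes(description))
```

Because the lookup uses only the cursor metadata, it works even when every returned value is null. The hard part is that each driver uses its own type codes, so a real version needs one table per driver.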
This is probably a rare case that I pre-empted into my testing anyway. I appreciate the help though!
I'm actually most of the way to implementing a Polars-specific driver module type_code reverse-lookup/translation; expect a PR to land shortly! ✌️ #blackmagic
Issue Description
When pl.read_database is used and all values are null in a SQL column, a pl.Null column is returned, regardless of the data type in the database (example here with MSSQL). When read_database_uri is used, the proper dtype is returned. This may only apply to sqlalchemy connections.
In this example, I have five columns with different dtypes, each with a single null value. I've provided the setup below to help reproduce the example.
with read_database_uri: correct dtype returned
with read_database/sqlalchemy: incorrect dtype returned
Setup
Create a table with different data types, with only null values
Check values:
We have five columns each with a single null value.
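A minimal stand-in for this setup, assuming SQLite via the standard-library sqlite3 module rather than the MSSQL and Postgres servers discussed in this thread. The table name null_test is hypothetical. It shows what a plain DBAPI2 cursor exposes for an all-null result: the rows carry no typed values, and for sqlite3 specifically the type_code slot of cursor.description is always None. Other drivers put their own driver-specific codes there instead.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption: not the original server)
conn = sqlite3.connect(":memory:")

# Five columns with different declared types, holding a single all-null row
conn.execute(
    "CREATE TABLE null_test ("
    "int_col INTEGER, float_col REAL, char_col CHAR(10), "
    "varchar_col VARCHAR(10), date_col DATE)"
)
conn.execute("INSERT INTO null_test VALUES (NULL, NULL, NULL, NULL, NULL)")

cursor = conn.execute("SELECT * FROM null_test")
rows = cursor.fetchall()

# cursor.description holds one 7-item tuple per column:
# item 0 is the column name, item 1 is the type_code
column_names = [col[0] for col in cursor.description]
type_codes = [col[1] for col in cursor.description]

print(rows)          # the single row contains only None values
print(column_names)  # the five column names
print(type_codes)    # sqlite3 leaves every type_code as None

conn.close()
```

With only None values in the rows, and no usable type_code, nothing in the result itself indicates the declared column types.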
Installed versions