pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

"expected <datatype>, got <datatype>" - use datatype display repr consistently? #14717

Open MarcoGorelli opened 6 months ago

MarcoGorelli commented 6 months ago

Reproducible example

Here's a minor thing that bugs me

If you do

s.i64()?

but `s` is backed by Float64, then you'll get:

invalid series dtype: expected `Int64`, got `f64`

Example:

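from datetime import datetime

import polars as pl

# A single Datetime column; passing it as the `every` argument triggers the dtype error below
df = pl.DataFrame({'a': [datetime(2020, 1, 1)]})
df.select(pl.col('a').dt.truncate(pl.col('a')))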
---------------------------------------------------------------------------

SchemaError: invalid series dtype: expected `String`, got `datetime[μs]`

Log output

No response

Issue description

The "expected" side uses the name of the datatype (`Int64`, `String`), while the "got" side uses the Display of the datatype (`f64`, `datetime[μs]`).

Expected behavior

For the examples above:

expected `i64`, got `f64`

and

SchemaError: invalid series dtype: expected `str`, got `datetime[μs]`


I think it'd be good to consistently use the DataType Display one, as that's the one people see when they print dataframes
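
To make the proposal concrete, here is a toy model of the two formatting styles. This is only a sketch: `DType`, `mismatch_inconsistent` and `mismatch_consistent` are hypothetical names made up for illustration, not Polars APIs, and the real change would live in the Rust error-formatting code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DType:
    """Hypothetical stand-in for a dtype with two spellings."""
    name: str      # variant name, e.g. "Int64"
    display: str   # short form shown when printing dataframes, e.g. "i64"

    def __str__(self) -> str:
        # Plays the role of the DataType Display repr.
        return self.display


INT64 = DType("Int64", "i64")
FLOAT64 = DType("Float64", "f64")


def mismatch_inconsistent(expected: DType, got: DType) -> str:
    # Mirrors the reported message: variant name on one side, display on the other.
    return f"invalid series dtype: expected `{expected.name}`, got `{got}`"


def mismatch_consistent(expected: DType, got: DType) -> str:
    # Proposed message: display repr on both sides.
    return f"invalid series dtype: expected `{expected}`, got `{got}`"


print(mismatch_inconsistent(INT64, FLOAT64))
print(mismatch_consistent(INT64, FLOAT64))
```

The first line printed matches the current mixed message, the second matches the expected one from this issue.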

Installed versions

--------Version info---------
Polars: 0.20.10
Index type: UInt32
Platform: Linux-5.15.133.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python: 3.11.7 (main, Dec 8 2023, 18:56:58) [GCC 11.4.0]

----Optional dependencies----
adbc_driver_manager:
cloudpickle:
connectorx:
deltalake:
fsspec:
gevent:
hvplot:
matplotlib:
numpy: 1.26.4
openpyxl:
pandas: 2.2.0
pyarrow: 15.0.0
pydantic: 2.6.1
pyiceberg:
pyxlsb:
sqlalchemy:
xlsx2csv:
xlsxwriter:

stinodego commented 6 months ago

Related: https://github.com/pola-rs/polars/issues/13904