pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

`scan_csv` does not support a list of datatypes in `schema_overrides` #17813

Open nameexhaustion opened 1 month ago

nameexhaustion commented 1 month ago

Reproducible example

import polars as pl

pl.scan_csv("https://j.mp/iriscsv", schema_overrides=4 * [pl.String]).collect()

Fails with TypeError: expected 'schema_overrides' dict, found 'list'

Log output

No response

Issue description

`scan_csv` needs to accept a list of datatypes in `schema_overrides`, as `read_csv` does, so that `read_csv` can be dispatched to `scan_csv`.
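Until a list is accepted, one possible workaround is to turn the list into a dict keyed by column name before calling `scan_csv`. The following is a minimal sketch, not the fix itself. It assumes that positional dtypes apply left to right to the leading columns. `header_names` and `overrides_list_to_dict` are hypothetical helpers, not part of polars, and the string `"String"` stands in for `pl.String` so the sketch runs without polars installed.

```python
import csv
import io
from typing import Any, Sequence


def header_names(csv_text: str) -> list[str]:
    """Return the column names from the first row of some CSV text."""
    return next(csv.reader(io.StringIO(csv_text)))


def overrides_list_to_dict(
    columns: Sequence[str], dtypes: Sequence[Any]
) -> dict[str, Any]:
    """Pair positional dtypes with column names, left to right.

    Assumption of this sketch: a dtype list shorter than the column list
    covers only the leading columns; a longer one is treated as an error.
    """
    if len(dtypes) > len(columns):
        raise ValueError(
            f"got {len(dtypes)} dtypes for only {len(columns)} columns"
        )
    return dict(zip(columns, dtypes))


# A two-line sample with the same header as the iris file in this issue.
sample = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
)

columns = header_names(sample)
# "String" is a placeholder for pl.String (see the note above the block).
overrides = overrides_list_to_dict(columns, 4 * ["String"])
print(overrides)
```

With polars installed, the same helper can be given `4 * [pl.String]`, and the resulting dict can be passed as `schema_overrides` to `pl.scan_csv`, which expects a dict according to the error message above.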

Expected behavior

Same output as read_csv:

pl.read_csv("https://j.mp/iriscsv", schema_overrides=4 * [pl.String])

shape: (150, 5)
┌──────────────┬─────────────┬──────────────┬─────────────┬───────────┐
│ sepal_length ┆ sepal_width ┆ petal_length ┆ petal_width ┆ species   │
│ ---          ┆ ---         ┆ ---          ┆ ---         ┆ ---       │
│ str          ┆ str         ┆ str          ┆ str         ┆ str       │
╞══════════════╪═════════════╪══════════════╪═════════════╪═══════════╡
│ 5.1          ┆ 3.5         ┆ 1.4          ┆ 0.2         ┆ setosa    │
│ 4.9          ┆ 3           ┆ 1.4          ┆ 0.2         ┆ setosa    │
│ 4.7          ┆ 3.2         ┆ 1.3          ┆ 0.2         ┆ setosa    │
│ 4.6          ┆ 3.1         ┆ 1.5          ┆ 0.2         ┆ setosa    │
│ 5            ┆ 3.6         ┆ 1.4          ┆ 0.2         ┆ setosa    │
│ …            ┆ …           ┆ …            ┆ …           ┆ …         │
│ 6.7          ┆ 3           ┆ 5.2          ┆ 2.3         ┆ virginica │
│ 6.3          ┆ 2.5         ┆ 5            ┆ 1.9         ┆ virginica │
│ 6.5          ┆ 3           ┆ 5.2          ┆ 2           ┆ virginica │
│ 6.2          ┆ 3.4         ┆ 5.4          ┆ 2.3         ┆ virginica │
│ 5.9          ┆ 3           ┆ 5.1          ┆ 1.8         ┆ virginica │
└──────────────┴─────────────┴──────────────┴─────────────┴───────────┘

Installed versions

1.2.1

alonme commented 2 weeks ago

I am taking a look 👍