pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Polars panics when trying to read ndjson from a file-like object #13273

Closed jamesperez2005 closed 2 months ago

jamesperez2005 commented 8 months ago

Reproducible example

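import polars as pl

# Write a small frame to NDJSON, then try to read it back via a file object.
df = pl.DataFrame(
    {
        'x': [1, 2, 3],
        'y': ['a', 'b', 'c'],
    }
)
df.write_ndjson("/tmp/testme.ndjson")
# Passing an open file object (rather than a path) triggers the panic.
pl.read_ndjson(open("/tmp/testme.ndjson"))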

Log output

No response

Issue description

It is very useful to be able to read ndjson from file objects, e.g. when reading compressed data. Currently, passing an open file object to read_ndjson panics instead of returning a dataframe.
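
To make the compressed-data use case concrete, here is a minimal sketch using only the standard library. It decompresses a gzip-compressed NDJSON file into an in-memory binary buffer, which is the kind of file-like object one would want to hand to read_ndjson. The helper name `ndjson_gz_to_buffer`, the temporary file, and the sample rows are hypothetical and only for illustration; whether a given Polars version accepts such a buffer is an assumption and is not verified here.

```python
import gzip
import io
import json
import os
import tempfile


def ndjson_gz_to_buffer(path):
    """Decompress a gzip-compressed NDJSON file into an in-memory binary buffer."""
    with gzip.open(path, "rb") as fh:
        return io.BytesIO(fh.read())


# Hypothetical sample rows mirroring the frame in the reproducible example.
rows = [{"x": 1, "y": "a"}, {"x": 2, "y": "b"}, {"x": 3, "y": "c"}]

# Write the rows as gzip-compressed NDJSON (one JSON object per line).
path = os.path.join(tempfile.mkdtemp(), "testme.ndjson.gz")
with gzip.open(path, "wt", encoding="utf-8") as fh:
    for row in rows:
        fh.write(json.dumps(row) + "\n")

buf = ndjson_gz_to_buffer(path)

# Sanity check with the stdlib parser: each line of the buffer is one JSON object.
parsed = [json.loads(line) for line in buf.getvalue().splitlines()]

# Assumption: with Polars installed, the intended call would be
#   pl.read_ndjson(buf)
# That call is not executed here.
```

The buffer holds the whole decompressed payload in memory, which is fine for a sketch but would not suit very large files.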

Expected behavior

The dataframe should be read successfully.

Installed versions

```
--------Version info---------
Polars: 0.20.2
Index type: UInt32
Platform: macOS-14.2.1-arm64-arm-64bit
Python: 3.8.13 (default, Jun 26 2022, 15:51:52) [Clang 13.1.6 (clang-1316.0.21.2.5)]

----Optional dependencies----
adbc_driver_manager:
cloudpickle:
connectorx:
deltalake:
fsspec: 2022.10.0
gevent:
matplotlib: 3.6.1
numpy: 1.23.4
openpyxl: 3.1.2
pandas: 1.5.1
pyarrow: 9.0.0
pydantic:
pyiceberg:
pyxlsb:
sqlalchemy: 1.4.42
xlsx2csv:
xlsxwriter:
```

jamesperez2005 commented 8 months ago

The same problem reproduces with read_json/write_json, BTW.