pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Panic when importing parquet file #17175

Closed Chuck321123 closed 21 hours ago

Chuck321123 commented 4 days ago

Reproducible example

[image: Polars bug]

Log output

No response

Issue description

I got this panic error, but I don't know how to reproduce it. It would be nice if someone could find a fix for it.

Expected behavior

That it doesn't panic.

Installed versions

```
--------Version info---------
Polars:               1.0.0-beta.1
Index type:           UInt32
Platform:             Linux-6.8.0-1005-raspi-aarch64-with-glibc2.39
Python:               3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:17:49) [GCC 12.3.0]

----Optional dependencies----
adbc_driver_manager:
cloudpickle:          3.0.0
connectorx:
deltalake:
fastexcel:
fsspec:
gevent:
great_tables:
hvplot:
matplotlib:           3.9.0
nest_asyncio:         1.6.0
numpy:                1.26.4
openpyxl:             3.1.4
pandas:               2.2.2
pyarrow:              16.1.0
pydantic:
pyiceberg:
sqlalchemy:
torch:
xlsx2csv:
xlsxwriter:
```

ritchie46 commented 4 days ago

Can you show the full backtrace with RUST_BACKTRACE=1 and share the parquet file?
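
For reference, a minimal sketch of turning the backtrace on from inside Python instead of from the shell. The helper name `enable_rust_backtrace` is hypothetical and not part of Polars; the only real piece is the `RUST_BACKTRACE` environment variable. It is safest to set it before `polars` is imported, so that it is in place before any panic can occur.

```python
import os


def enable_rust_backtrace(level: str = "1") -> str:
    """Set RUST_BACKTRACE for this process and return the value now in effect.

    Hypothetical helper, not part of Polars. "1" requests a short
    backtrace and "full" a verbose one.
    """
    os.environ["RUST_BACKTRACE"] = level
    return os.environ["RUST_BACKTRACE"]


# Call this before `import polars` so the setting is in place before any panic.
enable_rust_backtrace("1")
```

Setting the variable in the shell that launches the script (for example prefixing the command with `RUST_BACKTRACE=1`) should have the same effect.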

Chuck321123 commented 4 days ago

@ritchie46 I have turned on the Rust backtrace from now on, but the file was updated while I was asleep and now reads fine. In its working state it consists of 1 row with a datetime column and several float32 columns (about 306 columns). I'll let you know if I'm able to reproduce the error.

Chuck321123 commented 4 days ago

@ritchie46 I had put my read-parquet function in a try-except block before it failed. Here are some previous error messages:

[image: Polars bug]
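
A minimal sketch of what such a wrapper might look like, assuming the goal is both to record the error and to keep a copy of the file that triggered it, so the file can still be shared after it has been overwritten. `read_or_snapshot`, its parameters, and the `failed_reads` directory are hypothetical, not the code used in this issue. It catches `BaseException` because panics raised from PyO3-based extensions derive from `BaseException`, so a plain `except Exception` would not catch them.

```python
import shutil
import time
import traceback
from pathlib import Path


def read_or_snapshot(reader, path, snapshot_dir="failed_reads"):
    """Call reader(path); on failure, keep a copy of the file.

    Returns (result, None) on success or (None, traceback_text) on failure.
    """
    try:
        return reader(path), None
    except (KeyboardInterrupt, SystemExit):
        raise  # never swallow an intentional interrupt or exit
    except BaseException:
        error_text = traceback.format_exc()
        src = Path(path)
        if src.exists():
            dest_dir = Path(snapshot_dir)
            dest_dir.mkdir(parents=True, exist_ok=True)
            # Timestamped copy, so a later rewrite of the file cannot erase the evidence.
            shutil.copy2(src, dest_dir / f"{int(time.time())}_{src.name}")
        return None, error_text
```

With Polars installed, this could be called as `df, err = read_or_snapshot(pl.read_parquet, "data.parquet")`, where the path is a placeholder. The copied file and the traceback text are then available to attach to a bug report.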

Chuck321123 commented 21 hours ago

After updating to v1.0.0-rc.2 I haven't encountered this error again. I'll reopen the case if it happens again.