delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0

`write_deltalake` throws parser error when using `rust` engine and big decimals #2510

Closed jorritsandbrink closed 3 weeks ago

jorritsandbrink commented 1 month ago

Environment

Delta-rs version: 0.17.4

Binding: Python

Environment:


Bug

What happened: Calling write_deltalake with the rust engine on a decimal value that has more than 16 digits throws the following error:

Exception: Parser error: can't parse the string value 1.1111111111111112e16 to decimal

The error does not occur with the pyarrow engine, nor with decimal values of 16 digits or fewer.
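
The value in the error message, 1.1111111111111112e16, looks like what you get when the 17-digit integer is round-tripped through a 64-bit float. The sketch below only illustrates that precision limit with the standard library; it assumes nothing about where in the write path such a conversion would happen, and the variable names are illustrative.

```python
from decimal import Decimal

big = 11111111111111111    # 17 digits, the value from the report
small = 1111111111111111   # 16 digits, a value that writes fine

# A float64 has a 53-bit significand, so above 2**53 (about 9.007e15)
# not every integer can be represented exactly.
as_float = float(big)

print(repr(as_float))               # note the trailing ...12 instead of ...11
print(int(as_float) == big)         # False: the last digit was rounded
print(int(float(small)) == small)   # True: this 16-digit value is below 2**53
print(str(Decimal(big)))            # Decimal itself keeps all 17 digits
```

Not every 16-digit value survives a float round trip (anything above 2**53 may be rounded), so the 16-digit boundary is approximate.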

What you expected to happen: The table is written without error.

How to reproduce it:

from decimal import Decimal
import tempfile

import pyarrow as pa
import deltalake
from deltalake import write_deltalake

assert deltalake.__version__ == "0.17.4"

# Any empty directory works as the table location
tmp_path = tempfile.mkdtemp()

big_decimal = Decimal(11111111111111111)  # 17 digits
data = {"decimal_column": pa.array([big_decimal])}
arrow_table = pa.table(data)
write_deltalake(tmp_path, arrow_table, engine="rust")  # throws parser error

More details: Perhaps related to #1778, #2193, #2221. I opened a new issue because this bug is rust-engine specific, while the others (seemingly) aren't.

ion-elgreco commented 1 month ago

It's a known issue, it will be resolved when we upgrade the arrow crates

rtyler commented 1 month ago

@ion-elgreco when you have a chance can you link to the arrow issue? That would be handy to have lying around :smile:

ion-elgreco commented 1 month ago

@rtyler this is the issue I created in arrow-rs: https://github.com/apache/arrow-rs/issues/5549, resolved by this PR: https://github.com/apache/arrow-rs/pull/5611

nixent commented 1 month ago

> It's a known issue, it will be resolved when we upgrade the arrow crates

@ion-elgreco would it be possible to update the crates with the next release?

ion-elgreco commented 1 month ago

@nixent no, we are waiting on the next release of arrow-rs