pacman82 / arrow-odbc

Fill Apache Arrow record batches from an ODBC data source in Rust.
MIT License

An error occurs in epoch_to_timestamp: multiply with overflow. #113

Open yjhong79 opened 1 day ago

yjhong79 commented 1 day ago

The epoch_to_timestamp function can handle time units from seconds to nanoseconds.

As explained in the documentation below, the range for nanoseconds is between 1677-09-21T00:12:43.145224192 and 2262-04-11T23:47:16.854775807: https://docs.rs/chrono/latest/chrono/struct.DateTime.html#method.timestamp_nanos_opt
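The bounds quoted above follow from storing nanoseconds since the Unix epoch in an i64, which covers roughly 292 years on either side of 1970. A small sketch using only the standard library (the helper name `split_nanos` is made up for illustration) splits the two extreme values into whole seconds and a nanosecond fraction:

```rust
/// Hypothetical helper: split a nanosecond epoch value into whole seconds
/// and a non-negative nanosecond fraction.
fn split_nanos(nanos: i64) -> (i64, i64) {
    (nanos.div_euclid(1_000_000_000), nanos.rem_euclid(1_000_000_000))
}

fn main() {
    let (max_secs, max_frac) = split_nanos(i64::MAX);
    let (min_secs, min_frac) = split_nanos(i64::MIN);
    println!("max: {max_secs} s + {max_frac} ns");
    println!("min: {min_secs} s + {min_frac} ns");
    // Rough number of 365-day years that fit on each side of 1970.
    println!("about {} years", max_secs / (365 * 86_400));
}
```

The fractions this yields, 854775807 and 145224192 nanoseconds, are the fractional seconds that appear in the range quoted from the chrono documentation.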

Inside the function, every value is converted to nanoseconds before further processing. If the input cannot be represented in nanoseconds (i.e., the scaled value falls outside the i64 range), debug builds panic with a multiply-with-overflow error. Release builds do not check for overflow by default, so the multiplication wraps silently and the resulting timestamp is wrong.
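To illustrate the difference between the two build modes, here is a minimal sketch that isolates only the scaling multiplication. It does not reproduce epoch_to_timestamp itself; the helper `micros_to_nanos` is hypothetical, and the microsecond value for 1600-06-18T23:12:44Z was computed by hand for this example.

```rust
/// Hypothetical helper: scale microseconds to nanoseconds, or return None
/// if the result does not fit into an i64.
fn micros_to_nanos(micros: i64) -> Option<i64> {
    micros.checked_mul(1_000)
}

fn main() {
    // Microseconds since the Unix epoch for 1600-06-18T23:12:44Z
    // (computed by hand for this sketch, not taken from the issue).
    let micros: i64 = -11_661_410_836_000_000;

    // Checked arithmetic reports the overflow instead of panicking or wrapping.
    println!("checked:  {:?}", micros_to_nanos(micros));

    // wrapping_mul mimics what an unchecked `*` does in a default release build.
    println!("wrapping: {}", micros.wrapping_mul(1_000));
}
```

The checked variant returns None for this input, while the wrapped product flips sign and ends up positive, which is why a release build can report a plausible-looking but wrong date.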

Therefore, it seems necessary to handle inputs differently depending on their scale.
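One possible way to handle each scale without ever forming the full nanosecond value is to split the input into whole seconds and a sub-second remainder first, and scale only the remainder. The sketch below is an assumption about how that could look, not the crate's actual fix; `split_epoch` and its const parameter `UNIT_FACTOR` (units per second) are hypothetical names.

```rust
/// Hypothetical replacement for the scaling step: split an epoch value given
/// in units of 1/UNIT_FACTOR seconds into whole seconds and nanoseconds.
/// Assumes UNIT_FACTOR is 1, 1_000, 1_000_000 or 1_000_000_000.
fn split_epoch<const UNIT_FACTOR: i64>(epoch: i64) -> (i64, u32) {
    // Euclidean division keeps the remainder in 0..UNIT_FACTOR,
    // also for timestamps before 1970.
    let secs = epoch.div_euclid(UNIT_FACTOR);
    let sub = epoch.rem_euclid(UNIT_FACTOR);
    // Only the sub-second part is scaled, so the product stays below 10^9.
    let nanos = sub * (1_000_000_000 / UNIT_FACTOR);
    (secs, nanos as u32)
}

fn main() {
    // Microseconds for 1600-06-18T23:12:44Z (computed by hand for this sketch).
    let (secs, nanos) = split_epoch::<1_000_000>(-11_661_410_836_000_000);
    println!("{secs} s + {nanos} ns");

    // One millisecond before the epoch.
    let (secs, nanos) = split_epoch::<1_000>(-1);
    println!("{secs} s + {nanos} ns");
}
```

The resulting pair could then be handed to a constructor that takes seconds and nanoseconds separately, such as chrono's `DateTime::from_timestamp(secs, nsecs)`, which returns an Option. Whether that fits the rest of the function is left open here.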

Below is test code to reproduce the error:

use chrono::DateTime;

// The year 1600 lies outside the range that i64 nanoseconds can represent.
let overflow_time = DateTime::parse_from_rfc3339("1600-06-18T23:12:44.000Z").unwrap();
let time = epoch_to_timestamp::<1_000_000>(overflow_time.timestamp_micros());
println!("{:?}", time);

// Release output:
// Timestamp { year: 1715, month: 11, day: 28, hour: 23, minute: 38, second: 10, fraction: 290448384 }

pacman82 commented 1 day ago

Hello @yjhong79,

thanks for reporting the bug!

Yeah indeed, I think you are correct. Luckily, so far no one seems to work in a domain that requires dating historical artifacts with nanosecond precision 😁.

I'll handle it, but I am pretty swamped until the end of the month.

Best, Markus