Closed andrei-ionescu closed 2 years ago
I did two more tests by saving the dimension_load_date column in the parquet file as:
int64 with TIMESTAMP_MICROS -- error 🔴
int64 with TIMESTAMP_MILLIS -- error 🔴
In both cases I got a similar overflow panic:
thread 'tokio-runtime-worker' panicked at 'attempt to multiply with overflow',
/rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/ops/arith.rs:344:1
The complete error output:
thread 'tokio-runtime-worker' panicked at 'attempt to multiply with overflow', /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/ops/arith.rs:344:1
stack backtrace:
0: rust_begin_unwind
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/std/src/panicking.rs:498:5
1: core::panicking::panic_fmt
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/panicking.rs:107:14
2: core::panicking::panic
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/panicking.rs:48:5
3: <i64 as core::ops::arith::Mul>::mul
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/ops/arith.rs:337:45
4: arrow::compute::kernels::arithmetic::multiply::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-6.2.0/src/compute/kernels/arithmetic.rs:1070:40
5: arrow::compute::kernels::arithmetic::math_op::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-6.2.0/src/compute/kernels/arithmetic.rs:181:23
6: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/ops/function.rs:280:13
7: core::option::Option<T>::map
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/option.rs:846:29
8: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/adapters/map.rs:103:9
9: arrow::buffer::mutable::MutableBuffer::from_trusted_len_iter
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-6.2.0/src/buffer/mutable.rs:437:21
10: arrow::buffer::immutable::Buffer::from_trusted_len_iter
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-6.2.0/src/buffer/immutable.rs:282:9
11: arrow::compute::kernels::arithmetic::math_op
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-6.2.0/src/compute/kernels/arithmetic.rs:187:27
12: arrow::compute::kernels::arithmetic::multiply
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-6.2.0/src/compute/kernels/arithmetic.rs:1070:12
13: arrow::compute::kernels::cast::cast_with_options
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-6.2.0/src/compute/kernels/cast.rs:941:17
14: datafusion::physical_plan::expressions::cast::cast_column
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-6.0.0/src/physical_plan/expressions/cast.rs:106:13
15: <datafusion::physical_plan::expressions::cast::CastExpr as datafusion::physical_plan::PhysicalExpr>::evaluate
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-6.0.0/src/physical_plan/expressions/cast.rs:94:9
16: datafusion::physical_plan::projection::ProjectionStream::batch_project::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-6.0.0/src/physical_plan/projection.rs:212:25
17: core::iter::adapters::map::map_try_fold::{{closure}}
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/adapters/map.rs:91:28
18: core::iter::traits::iterator::Iterator::try_fold
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/traits/iterator.rs:1995:21
19: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/adapters/map.rs:117:9
20: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/adapters/map.rs:117:9
21: <core::iter::adapters::ResultShunt<I,E> as core::iter::traits::iterator::Iterator>::try_fold
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/adapters/mod.rs:178:9
22: core::iter::traits::iterator::Iterator::find
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/traits/iterator.rs:2383:9
23: <core::iter::adapters::ResultShunt<I,E> as core::iter::traits::iterator::Iterator>::next
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/adapters/mod.rs:160:9
24: alloc::vec::Vec<T,A>::extend_desugared
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/alloc/src/vec/mod.rs:2646:35
25: <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/alloc/src/vec/spec_extend.rs:18:9
26: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/alloc/src/vec/spec_from_iter_nested.rs:37:9
27: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/alloc/src/vec/spec_from_iter.rs:33:9
28: <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/alloc/src/vec/mod.rs:2549:9
29: core::iter::traits::iterator::Iterator::collect
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/traits/iterator.rs:1745:9
30: <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter::{{closure}}
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/result.rs:1883:53
31: core::iter::adapters::process_results
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/adapters/mod.rs:149:17
32: <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/result.rs:1883:9
33: core::iter::traits::iterator::Iterator::collect
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/iter/traits/iterator.rs:1745:9
34: datafusion::physical_plan::projection::ProjectionStream::batch_project
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-6.0.0/src/physical_plan/projection.rs:210:9
35: <datafusion::physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-6.0.0/src/physical_plan/projection.rs:238:37
36: core::task::poll::Poll<T>::map
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/task/poll.rs:52:43
37: <datafusion::physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-6.0.0/src/physical_plan/projection.rs:237:20
38: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-core-0.3.18/src/stream.rs:120:9
39: futures_util::stream::stream::StreamExt::poll_next_unpin
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.18/src/stream/stream/mod.rs:1474:9
40: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.18/src/stream/stream/next.rs:32:9
41: datafusion::physical_plan::common::spawn_execution::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-6.0.0/src/physical_plan/common.rs:179:32
42: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/future/mod.rs:80:19
43: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/core.rs:161:17
44: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/loom/std/unsafe_cell.rs:14:9
45: tokio::runtime::task::core::CoreStage<T>::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/core.rs:151:13
46: tokio::runtime::task::harness::poll_future::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/harness.rs:461:19
47: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/panic/unwind_safe.rs:271:9
48: std::panicking::try::do_call
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/std/src/panicking.rs:406:40
49: <unknown>
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-6.0.0/src/physical_plan/distinct_expressions.rs:127:15
50: std::panicking::try
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/std/src/panicking.rs:370:19
51: std::panic::catch_unwind
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/std/src/panic.rs:133:14
52: tokio::runtime::task::harness::poll_future
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/harness.rs:449:18
53: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/harness.rs:98:27
54: tokio::runtime::task::harness::Harness<T,S>::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/harness.rs:53:15
55: tokio::runtime::task::raw::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/raw.rs:113:5
56: tokio::runtime::task::raw::RawTask::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/raw.rs:70:18
57: tokio::runtime::task::LocalNotified<S>::run
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/mod.rs:343:9
58: tokio::runtime::thread_pool::worker::Context::run_task::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/thread_pool/worker.rs:443:21
59: tokio::coop::with_budget::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/coop.rs:106:9
60: std::thread::local::LocalKey<T>::try_with
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/std/src/thread/local.rs:399:16
61: std::thread::local::LocalKey<T>::with
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/std/src/thread/local.rs:375:9
62: tokio::coop::with_budget
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/coop.rs:99:5
63: tokio::coop::budget
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/coop.rs:76:5
64: tokio::runtime::thread_pool::worker::Context::run_task
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/thread_pool/worker.rs:419:9
65: tokio::runtime::thread_pool::worker::Context::run
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/thread_pool/worker.rs:386:24
66: tokio::runtime::thread_pool::worker::run::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/thread_pool/worker.rs:371:17
67: tokio::macros::scoped_tls::ScopedKey<T>::set
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/macros/scoped_tls.rs:61:9
68: tokio::runtime::thread_pool::worker::run
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/thread_pool/worker.rs:368:5
69: tokio::runtime::thread_pool::worker::Launch::launch::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/thread_pool/worker.rs:347:45
70: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/blocking/task.rs:42:21
71: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/core.rs:161:17
72: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/loom/std/unsafe_cell.rs:14:9
73: tokio::runtime::task::core::CoreStage<T>::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/core.rs:151:13
74: tokio::runtime::task::harness::poll_future::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/harness.rs:461:19
75: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/core/src/panic/unwind_safe.rs:271:9
76: std::panicking::try::do_call
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/std/src/panicking.rs:406:40
77: <unknown>
78: std::panicking::try
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/std/src/panicking.rs:370:19
79: std::panic::catch_unwind
at /rustc/65c55bf931a55e6b1e5ed14ad8623814a7386424/library/std/src/panic.rs:133:14
80: tokio::runtime::task::harness::poll_future
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/harness.rs:449:18
81: tokio::runtime::task::harness::Harness<T,S>::poll_inner
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/harness.rs:98:27
82: tokio::runtime::task::harness::Harness<T,S>::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/harness.rs:53:15
83: tokio::runtime::task::raw::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/raw.rs:113:5
84: tokio::runtime::task::raw::RawTask::poll
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/raw.rs:70:18
85: tokio::runtime::task::UnownedTask<S>::run
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/task/mod.rs:379:9
86: tokio::runtime::blocking::pool::Inner::run
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/blocking/pool.rs:264:17
87: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
at /Users/aionescu/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.14.0/src/runtime/blocking/pool.rs:244:17
I ran some more tests and ultimately found the issue: two of the rows in the parquet file contain far-future timestamps (9999-12-31 02:00:00 and 9998-12-31 02:00:00) in the dimension_load_date column.
This is supported by Parquet and Spark.
Here is the content of the parquet file:
+------------+------------------+------------------+-------------------+
|licence_code|vehicle_make |fuel_type |dimension_load_date|
+------------+------------------+------------------+-------------------+
|odc-odbl |**Not Provided** |**Not Provided** |9999-12-31 02:00:00|
|odc-odbl |**Not Applicable**|**Not Applicable**|9998-12-31 02:00:00|
|odc-odbl |SAVIEM |Petrol |2021-06-09 03:02:37|
|odc-odbl |YAMAHA |Petrol |2021-06-09 03:43:47|
|odc-odbl |VAUXHALL |Petrol |2020-10-18 03:23:47|
|odc-odbl |VAUXHALL |Petrol |2021-06-09 03:02:37|
|odc-odbl |BMW |Petrol |2021-06-09 03:38:39|
|odc-odbl |MG |Petrol |2020-10-18 03:23:47|
|odc-odbl |PEUGEOT |Diesel |2020-10-18 03:35:16|
|odc-odbl |FORD |Diesel |2020-10-18 03:23:47|
|odc-odbl |FORD |Petrol |2020-10-18 03:12:55|
|odc-odbl |SKODA |Diesel |2021-06-09 03:02:37|
|odc-odbl |SHOGUN |Diesel |2020-10-18 03:12:55|
|odc-odbl |MITSUBISHI |Diesel |2021-06-10 01:15:47|
+------------+------------------+------------------+-------------------+
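A minimal sketch of why those two rows would overflow, assuming the cast in the backtrace (cast_with_options calling multiply) converts the column to nanosecond precision by multiplying each value by a constant factor. The helper name micros_to_nanos is hypothetical and is not part of arrow or datafusion, and the epoch values assume the displayed times are UTC. An i64 count of nanoseconds since the Unix epoch only reaches the year 2262, so a timestamp in the year 9999 cannot be represented at that precision:

```rust
/// Hypothetical helper (not an arrow API): convert microseconds since the
/// Unix epoch to nanoseconds, returning None instead of panicking on overflow.
fn micros_to_nanos(micros: i64) -> Option<i64> {
    micros.checked_mul(1_000)
}

fn main() {
    // 2021-06-09 03:02:37, assumed UTC, in microseconds since the epoch.
    let ordinary = 1_623_207_757_000_000_i64;
    // 9999-12-31 02:00:00, assumed UTC, in microseconds since the epoch.
    let far_future = 253_402_221_600_000_000_i64;

    // The ordinary row fits comfortably in i64 nanoseconds.
    println!("{:?}", micros_to_nanos(ordinary));
    // The far-future row exceeds i64::MAX once multiplied by 1_000.
    println!("{:?}", micros_to_nanos(far_future));

    // Largest whole number of seconds representable as i64 nanoseconds
    // (this falls in April 2262).
    println!("{}", i64::MAX / 1_000_000_000);
}
```

With a plain `*` in a debug build, the second multiplication would panic with 'attempt to multiply with overflow', which matches the message in the backtrace above.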
Should I close this ticket, since I opened the more specific ticket #1360?
Closing issue, I created a more specific one here: #1360.
Describe the bug
Reading a Parquet file with an int96 column results in a panic with the following error:

To Reproduce
Steps to reproduce the behavior:
1. Get the data-dimension-vehicle-20210609T222533Z-4cols-14rows.parquet file.
2. Run cargo new read-parquet, create a data folder in your project and put the parquet file in the data folder inside your project.
3. Edit the Cargo.toml file to contain the following:
   [dependencies]
   tokio = "1.14"
   arrow = "6.0"
   datafusion = "6.0"
4. Run cargo run.

I've tried the following combinations but I got the same error:
CAST to timestamp -- error 🔴
dimension_load_date -- success 🟢

Expected behavior
To be able to read that parquet file. The parquet file can be read with the parquet-tools CLI and Apache Spark.

Additional context
OS: macOS 12.0.1 (Monterey)
Rust: rustc 1.58.0-nightly (65c55bf93 2021-11-23)
Cargo: cargo 1.58.0-nightly (e1fb17631 2021-11-22)

I transformed the parquet file into CSV and everything worked as expected.