ajazam closed this issue 20 hours ago
I've tried Rust 1.82 and 1.83.
Looks like date_part was updated to return Int32 instead of Float64 in this PR https://github.com/apache/datafusion/pull/13466, which should fix this issue. As a workaround, you could try casting it like arrow_cast(EXTRACT(..), 'Int64')
I didn't implement float64 for hive partitioning because, well, floats in general are not exact values. Best to cast to an int.
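To illustrate the point about floats, here is a minimal standalone sketch. It assumes a hive partition directory is simply named column=value; the partition_dir helper is hypothetical and is not DataFusion's implementation. It shows that an integer key gives one stable directory name, while two float expressions that look equal on paper can produce different names.

```rust
/// Build a hive-style directory name such as `year=2016`.
/// Hypothetical helper for illustration only; not DataFusion's code.
fn partition_dir<T: std::fmt::Display>(column: &str, value: T) -> String {
    format!("{column}={value}")
}

fn main() {
    // An integer key maps to a single, stable directory name.
    println!("{}", partition_dir("year", 2016_i32));

    // 0.1 + 0.2 and 0.3 differ in their binary representation,
    // so they would end up in two different partition directories.
    let a = 0.1_f64 + 0.2_f64;
    let b = 0.3_f64;
    println!("{}", partition_dir("ratio", a));
    println!("{}", partition_dir("ratio", b));
    println!(
        "same directory: {}",
        partition_dir("ratio", a) == partition_dir("ratio", b)
    );
}
```

Casting the extracted date part to an integer, as suggested in this thread, avoids that ambiguity.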
Thanks gents, I got it working. For anybody else who comes up against this issue, I made the following alteration:
let df = ctx.sql("copy (SELECT dte, ot, arrow_cast(EXTRACT(YEAR FROM dte), 'Int32') AS year from data) to './partitioned_output' stored as parquet PARTITIONED BY (year)").await?;
Describe the bug
I am trying to create a parquet file with hive partitioning from CSV data, and I get this error:
Error: External(NotImplemented("it is not yet supported to write to hive partitions with datatype Float64"))
To Reproduce
main.rs

use std::fs::File;
use std::io::Write;
use arrow::datatypes::{DataType, Field, Schema};
use datafusion::prelude::*;
use tempfile::tempdir;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    let dir = tempdir()?;
    let file_path = dir.path().join("example.csv");
2016-07-01 00:00:00,2
2016-07-01 06:45:00,3"#
        .as_bytes())?;
}
Cargo.toml

[package]
name = "datafusion_csv"
version = "0.1.0"
edition = "2021"

[dependencies]
tokio = { version = "1", features = ["full"] }
datafusion = "43.0.0"
arrow = "53.3.0"
tempfile = "3.14.0"
Expected behavior
I am expecting a folder year=2016 containing a parquet file.
Additional context
I was originally trying to have folders for month and day, couldn't get the application to work, and then created this simpler example.
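For the original month and day goal, the same cast used in the workaround earlier in this thread can presumably be repeated per date part. The sketch below only builds the SQL string and has not been run against DataFusion. The partitioned_copy_sql helper and its parameters are hypothetical, and it assumes month and day are accepted as column aliases the same way year is.

```rust
/// Build a COPY statement that partitions by integer date parts.
/// Hypothetical helper; all parameter names are assumptions for this sketch.
fn partitioned_copy_sql(
    table: &str,
    columns: &[&str],
    ts_col: &str,
    out_dir: &str,
    parts: &[&str],
) -> String {
    // One casted EXTRACT per requested part, e.g. "year", "month", "day".
    let casted: Vec<String> = parts
        .iter()
        .map(|part| {
            let upper = part.to_uppercase();
            format!("arrow_cast(EXTRACT({upper} FROM {ts_col}), 'Int32') AS {part}")
        })
        .collect();

    format!(
        "COPY (SELECT {}, {} FROM {table}) TO '{out_dir}' STORED AS PARQUET PARTITIONED BY ({})",
        columns.join(", "),
        casted.join(", "),
        parts.join(", ")
    )
}

fn main() {
    let sql = partitioned_copy_sql(
        "data",
        &["dte", "ot"],
        "dte",
        "./partitioned_output",
        &["year", "month", "day"],
    );
    // The resulting string could then be passed to ctx.sql(&sql).await?
    println!("{sql}");
}
```

If this works as assumed, the output should be nested folders of the form year=2016/month=7/day=1, each holding a parquet file.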