duckdb / duckdb-rs

Ergonomic bindings to duckdb for Rust
MIT License

Leading x00 byte when querying BIT value using arrow #349

Open pshampanier opened 3 months ago

pshampanier commented 3 months ago

When I query the BIT value '11011110101011011011111011101111' using Arrow, I get an extra leading byte with the value \x00.

'11011110101011011011111011101111' is 32 bits, so it should come back as 4 bytes (\xDE\xAD\xBE\xEF), as shown by the following query run from the duckdb CLI:

SELECT '11011110101011011011111011101111'::BIT::BLOB;
┌───────────────────────────────────────────────────────────────┐
│ CAST(CAST('11011110101011011011111011101111' AS BIT) AS BLOB) │
│                             blob                              │
├───────────────────────────────────────────────────────────────┤
│ \xDE\xAD\xBE\xEF                                              │
└───────────────────────────────────────────────────────────────┘

But when I run the following code, I get a result of 5 bytes:

use duckdb::{params, Connection};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let conn = Connection::open_in_memory()?;
    let mut stmt = conn.prepare("SELECT '11011110101011011011111011101111'::BIT")?;
    let result = stmt.query_arrow(params![])?;
    println!("{:?}", result.collect::<Vec<_>>());
    Ok(())
}

Output:

[RecordBatch { schema: Schema { fields: [Field { name: "CAST('11011110101011011011111011101111' AS BIT)", data_type: Binary, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, columns: [BinaryArray
[
  [0, 222, 173, 190, 239],
]], row_count: 1 }]

The expected result is [222, 173, 190, 239].
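
To make the expectation easy to check without DuckDB, here is a minimal sketch that packs the same bit string into bytes and compares it with the Arrow output above. `pack_bits` is a hypothetical helper written for this illustration, not part of duckdb-rs, and it assumes most-significant-bit-first packing, which is what the CLI's BLOB cast output suggests.

```rust
/// Hypothetical helper (not part of duckdb-rs): packs a string of '0'/'1'
/// characters into bytes, most significant bit first.
/// Returns None if the input has other characters or its length is not a
/// multiple of 8.
fn pack_bits(bits: &str) -> Option<Vec<u8>> {
    let raw = bits.as_bytes();
    if raw.len() % 8 != 0 || !raw.iter().all(|&b| b == b'0' || b == b'1') {
        return None;
    }
    Some(
        raw.chunks(8)
            // Shift the accumulator left and append each bit in turn.
            .map(|chunk| chunk.iter().fold(0u8, |acc, &b| (acc << 1) | (b - b'0')))
            .collect(),
    )
}

fn main() {
    let expected = pack_bits("11011110101011011011111011101111").unwrap();
    println!("{:?}", expected);

    // Bytes reported by query_arrow in the output above.
    let from_arrow: [u8; 5] = [0, 222, 173, 190, 239];

    // The Arrow value has one more byte than expected, and the extra byte
    // is the leading 0; the remaining four bytes match.
    assert_eq!(from_arrow.len(), expected.len() + 1);
    assert_eq!(&from_arrow[1..], expected.as_slice());
}
```

For this one value, the Arrow result differs from the expected bytes only by the leading 0. Whether that holds for BIT values whose length is not a multiple of 8 is not covered by this report.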

[!NOTE]
rustc 1.75.0 (82e1608df 2023-12-21)
duckdb = "0.10.2"

pshampanier commented 1 month ago

The issue is still present with duckdb = "1.0.0".

[!NOTE]
rustc 1.75.0 (82e1608df 2023-12-21)
binary: rustc
commit-hash: 82e1608dfa6e0b5569232559e3d385fea5a93112
commit-date: 2023-12-21
host: x86_64-apple-darwin
release: 1.75.0
LLVM version: 17.0.6