sfu-db / connector-x

Fastest library to load data from DB to DataFrames in Rust and Python
https://sfu-db.github.io/connector-x
MIT License
1.94k stars 152 forks

Update Tiberius to Latest Version (Fix Conversion Error) #678

Open VictorLemosR opened 3 weeks ago

VictorLemosR commented 3 weeks ago

What language are you using?

Rust

What version are you using?

0.3.3

What database are you using?

MSSQL

What dataframe are you using?

Arrow

Can you describe your bug?

There is a conversion error on the tiberius side: it returns an Err when asked to interpret F64(Some(12.945)) as a Decimal value. This has been fixed in newer versions of tiberius. Connector-X currently uses tiberius 0.5.16 (from July 26, 2021), while the latest version is 0.12.3 (as of July 19, 2024).

Database setup if the error only happens on specific data or data type

I haven't tested extensively with different database setups, but the error occurs when tiberius tries to convert F64(Some(12.945)) to a Decimal.
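To make the failure mode concrete, here is a toy model of it. It is only a sketch: `ColumnValue`, `strict_decimal`, and `lenient_decimal` are hypothetical names made up for illustration, not tiberius's real types or functions. The lenient variant is just one possible way to accept a float, and is not a description of how newer tiberius versions resolved this.

```rust
// Toy model only: these types are hypothetical and are NOT tiberius's real definitions.
#[derive(Debug, Clone, PartialEq)]
enum ColumnValue {
    F64(Option<f64>),
    // A decimal stored as (unscaled integer, scale), e.g. (12945, 3) means 12.945.
    Decimal(Option<(i128, u8)>),
}

/// Strict conversion: accepts only the Decimal variant, so a float value
/// produces an error shaped like the one reported in this issue.
fn strict_decimal(value: &ColumnValue) -> Result<Option<(i128, u8)>, String> {
    match value {
        ColumnValue::Decimal(d) => Ok(*d),
        other => Err(format!("cannot interpret {:?} as a Decimal value", other)),
    }
}

/// Lenient conversion (an assumption, for illustration): also accepts finite
/// floats by rounding them to `scale` decimal places. Existing Decimal values
/// are passed through unchanged, without rescaling.
fn lenient_decimal(value: &ColumnValue, scale: u8) -> Result<Option<(i128, u8)>, String> {
    match value {
        ColumnValue::Decimal(d) => Ok(*d),
        ColumnValue::F64(None) => Ok(None),
        ColumnValue::F64(Some(f)) if f.is_finite() => {
            let factor = 10f64.powi(scale as i32);
            Ok(Some(((f * factor).round() as i128, scale)))
        }
        other => Err(format!("cannot interpret {:?} as a Decimal value", other)),
    }
}
```

In this model, `strict_decimal(&ColumnValue::F64(Some(12.945)))` returns an `Err`, while `lenient_decimal(&ColumnValue::F64(Some(12.945)), 3)` returns the decimal `(12945, 3)`.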

Example query / code

My query is quite simple:

select a, b from X..Y where a in (42,45,50,85,151,160) and b >= '010124' and b < '300824'

fn build_sql_connector() -> SourceConn {
    let uri = format!("mssql://{username}:{password}@{server}:/{database}");
    SourceConn::try_from(uri.as_str()).expect("Uri read failed")
}

fn do_queries(
    queries: Vec<String>,
    sql_connection: SourceConn,
) -> Result<Vec<RecordBatch>, ArrowDestinationError> {
    let mut converted_queries = Vec::new();
    for query in queries {
        converted_queries.push(CXQuery::from(&query));
    }
    let destination =
        get_arrow(&sql_connection, None, &converted_queries).expect("Failed to do the query");

    destination.arrow()
}

fn convert_batch_to_dataframe(arrow_array: Vec<RecordBatch>) -> Vec<DataFrame> {
    let mut dataframe: Vec<DataFrame> = Vec::new();
    for batch in arrow_array.into_iter() {
        dataframe.push(record_batch_to_dataframe(&batch).unwrap_or_else(|error| {
            panic!("Error when converting from arrow to polars: {}", error);
        }));
    }

    dataframe
}

fn record_batch_to_dataframe(batch: &RecordBatch) -> Result<DataFrame, PolarsError> {
    let schema = batch.schema();
    let mut columns = Vec::with_capacity(batch.num_columns());
    for (i, column) in batch.columns().iter().enumerate() {
        let polars_array = Box::<dyn polars_array>::from(&**column);

        columns.push(Series::from_arrow(
            schema.fields().get(i).unwrap().name(),
            polars_array,
        )?);
    }
    Ok(DataFrame::from_iter(columns))
}

What is the error?

thread '<unnamed>' panicked at C:\Users\victor.reial\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tiberius-0.5.16\src\row.rs:370:27:
called `Result::unwrap()` on an `Err` value: Conversion("cannot interpret F64(Some(12.945)) as an Decimal value")

Niivii commented 2 weeks ago

Updating would also fix the "SQL Server pre-login token not supported" issues in tiberius 0.5, which were fixed in 0.9.