sfu-db / connector-x

Fastest library to load data from DB to DataFrames in Rust and Python
https://sfu-db.github.io/connector-x
MIT License
1.94k stars 151 forks source link

Update Connection Pooling Crate #107

Closed wseaton closed 2 years ago

wseaton commented 3 years ago

It would be nice if we could get logs from rust over the wire for debug purposes, preferably configurable from the Python client.

I have a supposedly posgtres compatible source that fails due to

RuntimeError: Cannot get metadata for the queries, last error: Some(Error { kind: UnexpectedMessage, cause: None })

In the meantime, any tips for debugging this?

dovahcrow commented 3 years ago

Indeed you can set the environment variable RUST_LOG=connectorx=debug to see the debug information, even in Python (do it before import connectorx).

If you can share with us what database are you using, we can also take a look into it.

Since it says UnexpectedMessage I guess it is a compatibility issue.

wseaton commented 3 years ago

@dovahcrow thanks for the tip, that helped narrow down the issue to a particular metadata query. Digging through the code I'm not sure exactly where the compatibility issue lies, I have standalone code that can query the database fine.

    let mut builder = SslConnector::builder(SslMethod::tls_client()).unwrap();
    builder.set_ca_file(cert_location).unwrap();

    let mut connector = MakeTlsConnector::new(builder.build());
    connector.set_callback(|connect, _| {
        connect.set_verify_hostname(false);
        Ok(())
    });

    let conn_string = format!(
        "host={host} user={user} password={password} port={port} dbname={dbname} sslmode=require",
        host = &host,
        port = &port,
        user = &username,
        password = &password,
        dbname = &dbname
    );

    let mut client = Client::connect(&conn_string, connector).unwrap();
    let res = client.query("select 1 limit 1;", &[]).unwrap();

    println!("{:?}", &res);

My database definitely doesn't support binary or csv protocols, but from what I can tell if the above code works then cursor should be supported. I'll keep digging.

wseaton commented 3 years ago

The specific database I'm trying to integrate w/ is Teiid, which purports to expose a PostgreSQL compatible bridge.

What Teiid provides is actually an interface that emulates the PostgreSQL Database. It has been vetted to work with PostgreSQL ODBC (and JDBC drivers) along with access from several platforms including Node.js. SQLAlchemy, and tools such as QGIS.

sqlalchemy's psycopg2 driver definitely works, which makes me think it may be something in the connection pooling layer it doesn't like rather than the protocol itself.

wseaton commented 3 years ago

Okay yep, in my case it is definitely the connection pooling:

let config: Config = conn_string.parse()?;
let manager = PostgresConnectionManager::new(config, connector);
let pool = Pool::builder().max_size(1 as u32).build(manager)?;

let mut conn = pool.get().unwrap();
let res = conn.query("select 1 limit 1;", &[]).unwrap();
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error(Some("unexpected message from server"))', src/main.rs:47:31

This bare minimum example w/ pooling enabled throws the same error I was experiencing above.

@dovahcrow are there any plans to make connection pooling an opt-in feature?

dovahcrow commented 3 years ago

@wseaton Thanks for the analysis! I'll take a look into the pooling to see which command triggers the issue. Using the pool is just for convenience purposes. We will definitely remove it in a future release.

dovahcrow commented 3 years ago

@wseaton By drilling down a little bit, I found the Postgres pool will execute an empty command to test if the connection is valid https://github.com/sfackler/r2d2-postgres/blob/master/src/lib.rs#L71. I guess this simple_query("") is triggering the unexpected message from server error. May I ask if you can help me confirm this? If so, an easy solution will be ask the upstream to change the simple_query("") to simple_query("SELECT 1").

wseaton commented 3 years ago

@dovahcrow hmm, looks like the compatibility issue is deep in tokio's simple query protocol, I still get unexpected message when passing in a SELECT 1; instead of an empty query.

dovahcrow commented 3 years ago

I see. Let me first fork the r2d2-postgres library and change the simple_query to a normal query to see if it fixes the problem. Maybe Teiid does not support this one https://www.postgresql.org/docs/10/protocol-flow.html#id-1.10.5.7.4.

wseaton commented 3 years ago

@dovahcrow coming back to this I was able to figure it out, it's not simple_query that Teiid doesn't support, it's actually passing in a blank query that is only whitespace that makes Teiid upset.

Changing the connection validation function to the following fixes it:

    fn is_valid(&self, client: &mut Client) -> Result<(), Error> {
        client.simple_query("SELECT 1;").map(|_| ())
    }

I'll open a ticket upstream there to see if we can get it changed, although I think Teiid is violating the spec here.

dovahcrow commented 3 years ago

@wseaton Thanks for opening the issue to the upstream! Let's wait for a little bit to see if r2d2-postgres is going to fix it. Since we are also thinking to let the user reuse the connections across different read_sql, we might need to keep the connection pooling.

dovahcrow commented 3 years ago

Good news, @wseaton r2d2-postgres changed to SELECT 1. Now we wait for them to publish a version.

wseaton commented 3 years ago

@dovahcrow we got a release: https://github.com/sfackler/r2d2-postgres/releases/tag/v0.18.1! Thanks again for all your help on this!

dovahcrow commented 3 years ago

Great! Will include this in the next release.