duckdb / duckdb_mysql

https://duckdb.org/docs/extensions/mysql
MIT License
55 stars 13 forks source link

Unable to attach when a connection parameter has whitespace or equal sign #42

Closed danilobellini closed 8 months ago

danilobellini commented 8 months ago

What happens?

In mysql_utils.cpp, the ParseConnectionParameters evaluates the string from ATTACH 'k1=v1 k2=v2' AS name (TYPE mysql). These key-value pairs are parsed literally using the whitespace and = symbol:

// ...
auto parameters = StringUtil::Split(dsn, ' ');
for (auto &param : parameters) {
    StringUtil::Trim(param);
    if (param.empty()) {
        continue;
    }
    auto splits = StringUtil::Split(param, '=');
    if (splits.size() != 2) {
        throw InvalidInputException("Invalid dsn \"%s\" - expected key=value pairs separated by spaces", dsn);
    }
    auto key = StringUtil::Lower(splits[0]);
    auto &value = splits[1];
    // ..
}
// ...

The Trim call means leading and trailing whitespace are discarded, and the check for split result size forces the = sign not to appear in value. This means the value cannot include the whitespace character, TABs are only discarded if they are trailing characters of a given value, and values can't include the equal sign = character. This is limiting when it comes to username and password. I'm also not sure of all that can happen with non-ASCII characters.

In my case, I'd like to attach to MySQL while using a random generated password that can possibly include whitespace and equal sign, but in this case I found no way to do so because there's no escape rule that would allow using these characters.

To Reproduce

I ran MySQL locally and created, with SQLAlchemy, a user with these characters:

username = "a ç"  # Whitespace and non-ASCII
password = ":@ tesT'=\t-\"=\n-&\t"  # ASCII random password w/ tab and newline
with engine.begin() as conn:
    conn.execute(
        sa.text(
            "CREATE USER :username IDENTIFIED BY :password"
        ).bindparams(
            usename=username,
            password=password,
        )
    )

I was able to connect to MySQL using these credentials, using URI encoding (from urllib.parse.quote in Python) to connect to it in another SQLAlchemy engine, where the user became a%20%C3%A7 (stating this avoids a possible ambiguity with the encoding for ç).

Attempting to use URI encoding in DuckDB raise an IOException. When removing the whitespace in the username, it attempts to connect with a alone ignoring the ç in the Access denied for user message, but this might be just a visualization detail (if ç isn't the last character, it appears in the message). Quotes make no difference for the parameter splitting, nor backslashes.

I might be missing something, but I found no other way to create a MySQLConnectionParameters from Python to use with DuckDB for an ATTACH statement when the only credentials I have are like the ones in the example above.

OS:

Linux

MySQL Version:

8.0.34

DuckDB Version:

0.10.0

DuckDB Client:

Python

Full Name:

Danilo de Jesus da Silva Bellini

Affiliation:

HexD Tech

Have you tried this on the latest main branch?

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?