apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
37.12k stars 14.31k forks source link

MsSql hook does not use extra parameters when creating PymssqlConnection #43798

Open ilarionkuleshov opened 1 week ago

ilarionkuleshov commented 1 week ago

Apache Airflow Provider(s)

microsoft-mssql

Versions of Apache Airflow Providers

apache-airflow-providers-microsoft-mssql==3.9.1

Apache Airflow version

2.9.2

Operating System

Ubuntu 22.04

Deployment

Docker-Compose

Deployment details

No response

What happened

I found that MsSqlHook in get_conn method does not use extra parameters from airflow connection when creating PymssqlConnection. Thus, it is impossible to pass any special argument from the extra parameters of the airflow connection to pymssql.connect function. I discovered this when I needed to pass the tds_version argument, because without it the connection to my database did not occur successfully. Also, the documentation says that extra parameters can be used in MSSQL connection, but in fact this is not the case.

What you think should happen instead

I think the get_conn method should look something like this:

def get_conn(self) -> PymssqlConnection:
    """Return ``pymssql`` connection object."""
    conn = self.connection
    extra_conn_args = {arg_name: arg_val for arg_name, arg_val in conn.extra_dejson.items() if arg_name != "sqlalchemy_scheme"}  # new code line
    return pymssql.connect(
        server=conn.host,
        user=conn.login,
        password=conn.password,
        database=self.schema or conn.schema,
        port=str(conn.port),
        **extra_conn_args,  # new code line
    )

I tested it locally, and with this code the extra parameters are successfully passed to PymssqlConnection. And in my case the connection to the DB was successful.

I would also like to note that I added an excluding for the sqlalchemy_scheme parameter, because in the hook it is used in the sqlalchemy_scheme property, and in the pymssql.connect function it is not needed.

How to reproduce

To reproduce the bug, you should try to add any extra parameter to the airflow connection. And this will not affect the connection to the database, because the hook does not use extra parameters to create a connection to the database.

Anything else

I could probably create a PR with a fix, but since I'm new here, I assume that there might be a catch, and maybe it was intended that extra parameters not be passed to the pymssql connection (if so I wonder why). So, for now, I decided to just create an issue.

Are you willing to submit PR?

Code of Conduct

boring-cyborg[bot] commented 1 week ago

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.