apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

JDBC connect string from connection uri parsed wrongly #11710

Closed humbledude closed 2 years ago

humbledude commented 4 years ago

Apache Airflow version: 1.10.12

Kubernetes version (if you are using kubernetes) (use kubectl version): 1.15.11

Environment:

What happened: I was testing the HashiCorp Vault secrets backend together with SqoopOperator.

I found that if I store a JDBC connect string in the URI format suggested by Airflow, the connect string is parsed incorrectly.

JDBC connect string example: jdbc:postgresql://username:password@host:port?param1=value1&param2=value2

I found that it is caused by urllib.parse.urlparse, which airflow.models.connection.Connection uses in the code here
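A minimal sketch of the parsing behavior described above, using only the standard library. The credentials, host, and the port number 5432 are placeholders, and the snippet calls urlparse directly rather than going through Airflow's Connection class, so it only approximates what happens inside Airflow.

```python
from urllib.parse import urlparse

# Placeholder JDBC connect string modeled on the example above
# (5432 stands in for the literal "port").
jdbc_uri = "jdbc:postgresql://username:password@host:5432?param1=value1&param2=value2"

parsed = urlparse(jdbc_uri)

# Only "jdbc" is taken as the scheme. The remainder does not start with "//",
# so it is kept as a plain path and no host or credentials are extracted.
print("scheme:  ", parsed.scheme)
print("netloc:  ", repr(parsed.netloc))
print("path:    ", parsed.path)
print("hostname:", parsed.hostname)
print("username:", parsed.username)
```

Because the text after the first colon is "postgresql://..." rather than "//...", the host, port, login, and password never reach the fields a hook would read.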

If I use an alternative secrets backend, the URI format suggested by Airflow seems to be the only choice.

I think Airflow should provide another option to support JDBC connect strings with alternative secrets backends.

Thank you.

What you expected to happen:

How to reproduce it:

  1. Set up the Vault secrets backend
  2. Create a connection whose value is a JDBC connect string
  3. Run a SqoopOperator task with that connection

Anything else we need to know:

boring-cyborg[bot] commented 4 years ago

Thanks for opening your first issue here! Be sure to follow the issue template!

dstandish commented 2 years ago

Hi @humbledude

I think you may be confusing the two types of URIs here.

The Airflow connection URI is a mechanism for serializing Airflow's Connection object to a string.

You can't just take a JDBC URI and expect it to parse as an Airflow connection. Airflow hooks read Airflow Connection objects in many different ways, and you have to understand what the hook is looking for in order to build the connection in the right way.

First, you should replace jdbc:postgresql with sqoop in your connection URI. This change alone will probably make it parse.
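As a rough illustration of why the scheme swap helps, here is a standard-library sketch that parses a sqoop:// URI. The credentials, host, and port 5432 are placeholders, and the `fields` dictionary is a hypothetical stand-in for the attributes of a Connection object. It is not Airflow's own parsing code, and it says nothing about how the Sqoop hook then uses those values.

```python
from urllib.parse import urlparse, parse_qsl

# Same placeholder values as the JDBC example, but with a single "sqoop" scheme.
sqoop_uri = "sqoop://username:password@host:5432?param1=value1&param2=value2"

parsed = urlparse(sqoop_uri)

# Hypothetical mapping onto Connection-like fields, for illustration only.
fields = {
    "conn_type": parsed.scheme,
    "host": parsed.hostname,
    "port": parsed.port,
    "login": parsed.username,
    "password": parsed.password,
    "extra": dict(parse_qsl(parsed.query)),
}

for name, value in fields.items():
    print(f"{name}: {value!r}")
```

With "//" directly after the scheme, urlparse recognizes the network location, so host, port, and credentials are all populated. Whether the hook turns them into the intended Sqoop command still has to be checked separately.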

But next you need to debug your URI to make sure it is interpreted by the hook in the right way.

Looking at the sqoop hook, I see there is a _prepare_command method that would be useful for debugging the command.

You can call this method and examine the output. If necessary, change your URI.

You can use Connection.get_uri to produce a URI given an airflow Connection object.

Closing for now, let us know if you continue to have trouble.