apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Hashicorp Vault: VAULT_CAPATH & VAULT_CACERT broken by pre-creating session to pass along #37611

Open Blizzke opened 4 months ago

Blizzke commented 4 months ago

Apache Airflow version

2.8.1

If "Other Airflow 2 version" selected, which one?

No response

What happened?

When VAULT_CAPATH is set for self-signed certificates, the value is loaded correctly by the HCP vault client. However, because Airflow's internal client pre-creates a session and passes it along, the adapter discards that value in favor of the one from the session.

Since the internal client neither reads those environment variables nor configures session.verify itself, it is impossible to pass a certificate, or a path to certificates, to the vault client.
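To make the mechanism described above concrete, here is a minimal, standard-library-only model of it. It is a sketch, not the actual hvac or Airflow code: `StubSession`, `StubAdapter`, `resolve_verify_from_env` and the CA path are hypothetical stand-ins, and the assumption is that the adapter prefers a truthy `session.verify` over the `verify` value it was given.

```python
import os


class StubSession:
    """Hypothetical stand-in for requests.Session: verifies against the default CA bundle."""

    def __init__(self):
        self.verify = True


class StubAdapter:
    """Simplified model of an adapter that prefers session settings over its own arguments."""

    def __init__(self, verify=True, session=None):
        if session is None:
            # No session supplied: build one and apply the caller's verify value.
            session = StubSession()
            session.verify = verify
        elif session.verify:
            # Pre-built session supplied: its verify value replaces the caller's.
            verify = session.verify
        self.verify = verify
        self.session = session


def resolve_verify_from_env(verify=True, environ=os.environ):
    """Model of a client turning VAULT_CACERT / VAULT_CAPATH into a verify value."""
    if verify is True:
        return environ.get("VAULT_CACERT") or environ.get("VAULT_CAPATH") or True
    return verify


env = {"VAULT_CAPATH": "/etc/ssl/private-ca"}  # hypothetical path
verify = resolve_verify_from_env(environ=env)

without_session = StubAdapter(verify=verify)
with_session = StubAdapter(verify=verify, session=StubSession())

print(without_session.verify)  # /etc/ssl/private-ca
print(with_session.verify)     # True
```

In this model the CA path survives only when no session is passed in. As soon as a default session is handed over, its `verify=True` replaces the path, which matches the behavior reported here.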

What you think should happen instead?

Being able to control the verify behavior.

How to reproduce

Use a self-signed certificate for your vault and try to specify it using the environment variables (VAULT_CAPATH or VAULT_CACERT).

Operating System

arch

Versions of Apache Airflow Providers

apache-airflow-providers-hashicorp==3.6.3

Deployment

Docker-Compose

Deployment details

No response

Anything else?

No response


Blizzke commented 4 months ago

Sorry, this should've been a provider bug. Mea culpa.

eladkal commented 4 months ago

@Blizzke Is this report related to https://github.com/apache/airflow/issues/37619 ?

Blizzke commented 4 months ago

Not sure I get what you mean. I encountered this problem first, while I was trying to connect Airflow to our vault (with self-signed certs). I encountered #37619 after I managed to work around this issue. So they're related in the sense that they're problems with the same provider, but they don't have anything in common otherwise...

eladkal commented 4 months ago

@tungbq maybe you can look into this issue ?

tungbq commented 4 months ago

> @tungbq maybe you can look into this issue ?

Sure, I will take a look

tungbq commented 3 months ago

Hi @Blizzke thanks for catching and opening the issue. Could you please provide the detailed script/function you are using when specifying a VAULT_CAPATH and the error log you are facing? It would help me understand/debug the issue better. Thanks!

evgeniikozlov commented 1 month ago

I found a related issue. I don't know whether I should open a separate issue; please let me know if that is required.

Apache Airflow version

2.9.1

What happened?

We use HashiCorp Vault as the secrets backend and pass the certificate via the "verify" parameter, like: { "AIRFLOW__SECRETS__BACKEND_KWARGS": { "verify": "cert_path" } }. Starting with apache-airflow-providers-hashicorp==3.4.2 this parameter is broken: it is effectively not used. In airflow\providers\hashicorp\_internal_client\vault_client.py, line 207, a session is created, but the "verify" parameter is not applied to it. It is still passed to hvac.Client via kwargs (line 212), but inside the hvac adapter "verify" is overwritten from the session object (hvac.adapters.py, line 97), so the original value is lost.

I believe the original problem reported by @Blizzke is similar. However, VAULT_CAPATH is a variable of the hvac client, so the original problem could probably also be fixed in the hvac repo by giving the "verify" argument priority over session.verify.

What you think should happen instead?

The "verify" parameter should be honored, either by applying it to the session that vault_client.py creates, or by not creating the session in vault_client.py at all (leaving that to the hvac client).
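A rough sketch of the first option, assuming the session object exposes a writable `verify` attribute the way `requests.Session` does. The helper name `build_session`, the `stub_session` factory and the certificate path are hypothetical and only serve the illustration; this is not the provider's actual code.

```python
from types import SimpleNamespace


def build_session(session_factory, client_kwargs):
    """Create a session and carry over the caller's 'verify' setting, if any."""
    session = session_factory()
    if "verify" in client_kwargs:
        # Apply verify to the session so it is not lost when the session
        # is later handed to the Vault client.
        session.verify = client_kwargs["verify"]
    return session


def stub_session():
    """Hypothetical stand-in for requests.Session (default verify=True)."""
    return SimpleNamespace(verify=True)


session = build_session(stub_session, {"verify": "/path/to/ca.pem"})
print(session.verify)  # /path/to/ca.pem
```

In real code the factory would presumably be `requests.Session`. When "verify" is absent from the kwargs, the session keeps its default, so existing setups would behave as before.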

How to reproduce

Pass a certificate via the "verify" keyword parameter to the VaultHook constructor.

Versions of Apache Airflow Providers

apache-airflow-providers-hashicorp==3.6.4


eladkal commented 3 weeks ago

@tungbq are you still working on this issue?