GoogleCloudPlatform / airflow-operator

Kubernetes custom controller and CRDs for managing Airflow
Apache License 2.0
299 stars 68 forks

git-sync on airflow does not support SSH authentication #18

Open raam86 opened 5 years ago

raam86 commented 5 years ago

The current object only supports the username + password authentication method: https://github.com/GoogleCloudPlatform/airflow-operator/blob/1fbc02dbbaefd70b43f1db5b4a51e3d5ab6cd537/pkg/apis/airflow/v1alpha1/utils.go#L307

In order to fully support git-sync we need to (1) set GIT_SYNC_SSH to true and (2) map a mounted volume that points to the ConfigMap: https://github.com/kubernetes/git-sync/blob/9ceb61f7947fbe463b1cc6e9ae5d719f5d8eebd2/docs/ssh.md#step-3-configure-git-sync-container
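
As a rough illustration of the two changes described above, here is a minimal Python sketch that builds a git-sync container spec as a plain dict. It is not the operator's code: the function name `git_sync_ssh_container`, the volume name `git-secret`, the mount path `/etc/git-secret`, and the example repo URL are all assumptions made for illustration, and the container image is left out on purpose.

```python
import json
from typing import Any, Dict


def git_sync_ssh_container(repo: str, key_volume: str = "git-secret") -> Dict[str, Any]:
    """Return a git-sync container spec (plain dict) with SSH auth switched on.

    Hypothetical helper: the volume name and mount path are illustrative
    assumptions, not values taken from the operator.
    """
    return {
        "name": "git-sync",
        "env": [
            {"name": "GIT_SYNC_REPO", "value": repo},
            # Change 1: tell git-sync to use SSH instead of username + password.
            {"name": "GIT_SYNC_SSH", "value": "true"},
        ],
        "volumeMounts": [
            # Change 2: mount the volume that carries the SSH key material.
            {"name": key_volume, "mountPath": "/etc/git-secret", "readOnly": True},
        ],
    }


if __name__ == "__main__":
    # Example repo URL is a placeholder.
    spec = git_sync_ssh_container("git@github.com:example/dags.git")
    print(json.dumps(spec, indent=2))
```

The pod spec would still need a matching `volumes` entry whose name equals `key_volume`; that part is omitted here.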

barney-s commented 5 years ago

@raam86 Would you have time to take this up? I can help you with the setup and tests.

raam86 commented 5 years ago

After some more research I opted for a GCS sidecar container; I think it's the best of all worlds. Basically, I lifted your setup into helm/charts/incubator.

barney-s commented 5 years ago

Thanks. What did you mean by lifted your setup into helm/charts?

If you are using both, I would love to hear your feedback on helm/charts vs. the operator.

raam86 commented 5 years ago

I initially used the operator and got quite excited, but got stuck when trying to customize it. I specifically wanted to have the GCS sync as a sidecar instead of an init container, and that was easier to do using the helm chart. Since I realized I was going to have to do some work anyway, I wanted it to be less Google-specific, so I opted for the helm chart. The helm chart also has more users, more support, and is more standard.

barney-s commented 5 years ago

Ah, I see. GCS sync is a sidecar. The default value of .spec.dags.gcs.once is false. Which helm chart are you using?

raam86 commented 5 years ago

The one that just became stable. You are definitely right about the defaults. I think what actually happened was that I tried using git-sync, realized I can’t, opted for the helm chart, and eventually opted for GCS. Sorry about the confusion. But the underlying reason for continuing to use the helm chart instead of the operator is that the helm chart uses standard, readily available Kubernetes concepts, while the operator is minting new ones that I don’t have much incentive to learn.

dmateusp commented 5 years ago

Hi there. Since this is the first Google result when looking for SSH auth for git-sync on Airflow, I just wanted to let you know I've been working on adding this to the Airflow project:

https://issues.apache.org/jira/browse/AIRFLOW-3918
https://github.com/apache/airflow/pull/4777