apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Improve guide for KubernetesPodOperator #8970

Open mik-laj opened 4 years ago

mik-laj commented 4 years ago

Hello,

A better guide that describes how to use KubernetesPodOperator would be useful. This is one of the most frequently used Airflow operators, and many users have problems with it. We currently have a guide, https://airflow.readthedocs.io/en/latest/howto/operator/kubernetes.html, but it does not describe all the basic concepts, so it is not very useful for the end user.

It would be fantastic if the following topics were discussed.

If anyone is interested in this task, I am willing to provide all the necessary tips and information. If you are interested in writing a guide, you do not have to describe all the sections. Even one section will be very helpful and will make it easier to learn about Airflow.

If anyone wants to learn more about this operator, I invite you to read the following articles:
https://cloud.google.com/composer/docs/how-to/using/using-kubernetes-pod-operator
https://www.astronomer.io/docs/kubepodoperator/
https://www.astronomer.io/docs/cli-kubepodoperator/
https://medium.com/bluecore-engineering/were-all-using-airflow-wrong-and-how-to-fix-it-a56f14cb0753

Are you wondering how to start contributing to this project? Start by reading our contributor guide.

tanjinP commented 4 years ago

I can take this on as it relates to #8883

mik-laj commented 4 years ago

@tanjinP Which section would you like to focus on at the beginning?

How does KubernetesPodOperator work? It sends a request via the Kubernetes API and waits for the pod to finish. Airflow uses labels to identify the pod.
How to define env/configmap/image pull secrets?
How to use KubernetesPodOperator with a YAML file/JSON spec?
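To make these three topics concrete, here is a minimal sketch of the kind of pod manifest that ends up being sent to the Kubernetes API. It uses only the standard library and the field names of the Kubernetes v1 Pod schema. It is not Airflow's own code: the function names, the label keys, and every object name (`example-config`, `example-secret`, `example-registry-cred`, the `registry.example.com` image) are assumptions chosen for illustration.

```python
import json


def build_pod_manifest(dag_id, task_id, image, pull_secret_name):
    """Return a Kubernetes v1 Pod manifest as a plain dict (illustrative only)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            # Pod names must be DNS-compatible, so underscores become dashes.
            "name": f"{dag_id}-{task_id}".replace("_", "-").lower(),
            # Hypothetical label keys; labels let the launcher find its pod again.
            "labels": {"dag_id": dag_id, "task_id": task_id},
        },
        "spec": {
            "restartPolicy": "Never",
            # References a Secret holding private-registry credentials by name.
            "imagePullSecrets": [{"name": pull_secret_name}],
            "containers": [
                {
                    "name": "base",
                    "image": image,
                    "env": [
                        # Plain literal value.
                        {"name": "LOG_LEVEL", "value": "INFO"},
                        # Value read from a ConfigMap key (hypothetical names).
                        {
                            "name": "API_URL",
                            "valueFrom": {
                                "configMapKeyRef": {"name": "example-config", "key": "api_url"}
                            },
                        },
                        # Value read from a Secret key (hypothetical names).
                        {
                            "name": "API_TOKEN",
                            "valueFrom": {
                                "secretKeyRef": {"name": "example-secret", "key": "token"}
                            },
                        },
                    ],
                }
            ],
        },
    }


def label_selector(labels):
    """Format labels as an equality-based Kubernetes label selector string."""
    return ",".join(f"{key}={value}" for key, value in sorted(labels.items()))


manifest = build_pod_manifest(
    "example_dag", "run_pod", "registry.example.com/team/job:1.0", "example-registry-cred"
)
print(json.dumps(manifest, indent=2))  # the JSON spec form of the pod
print(label_selector(manifest["metadata"]["labels"]))
```

The selector string is what a client would pass when listing pods to find the one that belongs to a given task. The same dict, dumped as YAML instead of JSON, is the shape a pod template file would take.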

What do you think about these 3 sections?

tanjinP commented 4 years ago

@mik-laj I think we should include these as well, especially for people who are using Kubernetes for the first time or who want to pass some data around in the DAG context.

How to use KubernetesPodOperator with a private Docker image registry?
How does XCom work and how to use it?
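On the XCom question, a small sketch may help. According to the provider documentation, when the operator is created with `do_xcom_push=True`, whatever the container writes to `/airflow/xcom/return.json` is picked up and pushed as the task's XCom value. The helper below is an assumption about what the script inside the container could look like; `write_xcom_result` and the `rows_processed` payload are hypothetical names, and the demo writes to a temporary directory so that it also runs outside a pod.

```python
import json
import os
import tempfile

# File the operator reads the XCom value from when do_xcom_push=True.
XCOM_RETURN_PATH = "/airflow/xcom/return.json"


def write_xcom_result(result, path=XCOM_RETURN_PATH):
    """Serialize `result` as JSON to `path` and return the path written."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(result, handle)
    return path


# Demo: use a temporary location instead of the real path inside a pod.
demo_dir = tempfile.mkdtemp()
demo_path = write_xcom_result(
    {"rows_processed": 42}, os.path.join(demo_dir, "xcom", "return.json")
)
with open(demo_path, encoding="utf-8") as handle:
    print(json.load(handle))
```

Inside a real pod the script would call `write_xcom_result(...)` with the default path. The value has to be JSON-serializable, and a downstream task can then read it with the usual XCom pull.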

mik-laj commented 4 years ago

@tanjinP Fantastic! I am waiting for your contribution.

Siddharthk commented 4 years ago

@tanjinP @mik-laj I was not able to find documentation on how to run Airflow 1.10.10 on Kubernetes. Could you point me to it, if YAML manifests or Helm charts already exist?

mik-laj commented 4 years ago

@Siddharthk https://airflow.readthedocs.io/en/latest/kubernetes.html We have very limited documentation for Kubernetes, but it is worth looking at https://github.com/apache/airflow/pull/8777 and https://github.com/apache/airflow-on-k8s-operator We also have many good articles on the awesome list: https://github.com/jghoman/awesome-apache-airflow

Siddharthk commented 4 years ago

@mik-laj thanks for the information. Eagerly waiting to try out the helm chart.

Dr-Denzy commented 3 years ago

I will gladly take this on @kaxil

SoniaComp commented 2 years ago

@mik-laj Hi! I cannot access this link: https://airflow.readthedocs.io/en/latest/howto/operator/kubernetes.html Is this issue still active?

mik-laj commented 2 years ago

Here is the correct link for the latest released version: https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/operators.html