Open DimedS opened 2 months ago
@lasica @marrrcin Any thoughts? Are you accepting PRs on getindata/kedro-airflow-k8s?
You can use the official one and run on k8s. See https://getindata.com/blog/deploying-kedro-pipelines-gcp-composer-airflow-node-grouping-mlflow/
As I understand:
If I have a Kubernetes cluster, I can deploy Airflow there using Helm and customise the deployment with a values.yaml file and a custom Docker image to run my Kedro project's DAG. The process involves:
So technically, I don't need anything special to run Kedro on Airflow deployed on a Kubernetes cluster; it's enough to use a DAG created by the kedro-airflow plugin. However, this setup only allows me to run one Kedro project per Airflow deployment. If I want to run multiple projects in the same Airflow deployment, I can use the KubernetesPodOperator() for each Airflow task (i.e., Kedro node). This will execute each task in an isolated, customised container in a separate Kubernetes Pod, with the KubernetesExecutor dynamically managing all these pods.
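To make the one-pod-per-node idea concrete, here is a rough sketch of the keyword arguments each task could pass to KubernetesPodOperator. It is not what kedro-airflow generates today. The helper name, the default namespace and the image name are assumptions made up for illustration; the keys (task_id, name, namespace, image, cmds, arguments) are real KubernetesPodOperator parameters, and the command relies on `kedro run --nodes=...`.

```python
def pod_operator_kwargs(node_name: str, image: str, namespace: str = "airflow") -> dict:
    """Build KubernetesPodOperator kwargs that run a single Kedro node in its own pod.

    `pod_operator_kwargs` and the default namespace are hypothetical, for illustration only.
    """
    return {
        # Airflow task ids may contain underscores, so the Kedro node name is reused.
        "task_id": node_name,
        # Kubernetes pod names must be DNS-compatible: lowercase, no underscores.
        "name": node_name.lower().replace("_", "-"),
        "namespace": namespace,
        # Each Kedro project ships its own image, so several projects can share one Airflow.
        "image": image,
        "cmds": ["kedro"],
        "arguments": ["run", f"--nodes={node_name}"],
    }


# Hypothetical image name; in a real DAG file the dict would be unpacked into the
# operator, e.g. KubernetesPodOperator(**kwargs), imported from the
# apache-airflow-providers-cncf-kubernetes package.
kwargs = pod_operator_kwargs("split_data", "registry.example.com/project-a:latest")
```

Since the image is a per-task argument, DAGs from different Kedro projects can live in the same Airflow deployment without sharing a Python environment.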
However, this approach might be inefficient if there are many Kedro nodes, as it will require deploying many containers. It's better to group nodes to reduce the number of tasks, and thus the number of pods. If I understood correctly, additional functionality in the kedro-airflow plugin to help modify your DAG by inserting the KubernetesPodOperator() and KubernetesExecutor parts would be beneficial.
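A minimal sketch of the grouping idea, assuming each Kedro node has already been assigned to a group by some rule (a tag, a namespace, or a manual mapping). The function names and the example pipeline are hypothetical and not part of kedro-airflow; the only real interface relied on is `kedro run --nodes=a,b`. Each group becomes one Airflow task, and therefore one pod.

```python
from collections import defaultdict


def group_nodes(node_groups: dict) -> dict:
    """Invert a node -> group mapping into group -> [nodes], keeping insertion order."""
    groups = defaultdict(list)
    for node, group in node_groups.items():
        groups[group].append(node)
    return dict(groups)


def group_dependencies(node_deps: dict, node_groups: dict) -> dict:
    """Lift node-level dependencies (node -> parents) to group-level dependencies."""
    deps = defaultdict(set)
    for node, parents in node_deps.items():
        for parent in parents:
            src, dst = node_groups[parent], node_groups[node]
            if src != dst:  # edges inside a group are handled by Kedro itself
                deps[dst].add(src)
    return {group: sorted(parents) for group, parents in deps.items()}


def kedro_arguments(nodes: list) -> list:
    """Arguments for one pod that runs every node of a group in a single Kedro run."""
    return ["run", "--nodes=" + ",".join(nodes)]


# Hypothetical four-node pipeline collapsed into two tasks (two pods instead of four).
node_groups = {"clean": "prep", "split": "prep", "train": "model", "evaluate": "model"}
node_deps = {"split": ["clean"], "train": ["split"], "evaluate": ["train"]}

groups = group_nodes(node_groups)
deps = group_dependencies(node_deps, node_groups)
```

One caveat: the grouping rule has to keep the group-level graph acyclic, otherwise Airflow cannot schedule it. Datasets passed between groups also need to be persisted somewhere both pods can reach, since in-memory datasets do not survive across containers.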
Do you have the same opinion, @marrrcin? Is using the KubernetesPodOperator() for each task a good solution?
Hi,
so the solution I've linked above (https://getindata.com/blog/deploying-kedro-pipelines-gcp-composer-airflow-node-grouping-mlflow/) does exactly that - it either runs N:N \
Description
To facilitate running Kedro Airflow on Kubernetes, the kedro-airflow-k8s plugin was developed. However, it only supports versions of Kedro up to 0.18.0, while the current version is 0.19.4. Consequently, we have moved the recommendation to use this plugin to the end of our Airflow deployment documentation. We now need to determine the best approach for using Kedro Airflow on Kubernetes going forward.