apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Log aggregation from KubernetesExecutor using the Kubernetes API #17678

Open ArshiAAkhavan opened 3 years ago

ArshiAAkhavan commented 3 years ago

Description

There are three main ways of storing logs when running Airflow on k8s: a shared PersistentVolume, S3 object storage, and Elasticsearch. Putting the shared volume aside, the Elasticsearch approach hasn't been implemented nicely for the case where KubernetesExecutor is your executor.

The current flow is something like this:

The solution above can be improved because, as of now:

Although both solutions work, each of them still wastes resources.

Meanwhile, we have the KubernetesPodOperator, which by default does the fascinating job of retrieving logs from the task pod's stdout via the Kubernetes API!

I was thinking that by combining these two features (elasticsearch.write_stdout=true and the KubernetesPodOperator default behavior), we could send logs from the worker pods directly to the scheduler and have them stored in the scheduler pod instead.
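To make the idea concrete, here is a minimal sketch of the scheduler-side half, not an actual Airflow implementation. It assumes the worker pod already writes its task log to stdout (elasticsearch.write_stdout=true), that the task container is named `base`, and that iterating the raw response yields one line of bytes at a time. `stream_pod_logs` and `build_in_cluster_client` are hypothetical helper names; `read_namespaced_pod_log` is the call from the official `kubernetes` Python client.

```python
from pathlib import Path
from typing import Any, Iterable


def stream_pod_logs(core_v1: Any, pod_name: str, namespace: str,
                    log_file: Path, container: str = "base") -> int:
    """Copy a worker pod's stdout into a local file; return the line count."""
    # follow=True keeps the stream open until the container exits.
    # _preload_content=False returns the raw response, so lines are
    # consumed as they arrive instead of being buffered in memory.
    response: Iterable[bytes] = core_v1.read_namespaced_pod_log(
        name=pod_name,
        namespace=namespace,
        container=container,  # assumption: the task container is "base"
        follow=True,
        _preload_content=False,
    )
    log_file.parent.mkdir(parents=True, exist_ok=True)
    written = 0
    with log_file.open("a", encoding="utf-8") as sink:
        for raw in response:
            line = raw.decode("utf-8", errors="replace") if isinstance(raw, bytes) else raw
            sink.write(line.rstrip("\n") + "\n")
            written += 1
    return written


def build_in_cluster_client() -> Any:
    """Hypothetical helper: create a CoreV1Api client from inside the cluster."""
    # Imported lazily so the sketch above can be tried without the
    # `kubernetes` package installed.
    from kubernetes import client, config

    config.load_incluster_config()
    return client.CoreV1Api()
```

With something like this running in the scheduler pod, the logs end up on the scheduler's disk, so a single filebeat/logstash instance next to the scheduler could ship them to Elasticsearch.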

Use case / motivation

First of all, if you are deploying your scheduler and webserver outside of your k8s cluster, that's the end of the road for you, since your logs are then stored on a disk visible to both the webserver and the scheduler.

If you are deploying your scheduler and webserver on k8s (which is common practice), you still need a logstash/filebeat service to send logs to your Elasticsearch instance, but this time you won't need a whole DaemonSet or one instance per worker pod. One instance per scheduler pod would suffice, which uses far fewer resources (in my case I have only one scheduler pod, so it's only 1!).

What do you want to happen?

The whole process of remote logging to Elasticsearch is so hard compared to the other parts of deploying Airflow with KubernetesExecutor, and I am trying to make it easier.

Also, I feel like it's a more k8s-ish way to do it!

Are you willing to submit a PR?

If pointed in the right direction, yes!

boring-cyborg[bot] commented 3 years ago

Thanks for opening your first issue here! Be sure to follow the issue template!