teamclairvoyant / airflow-scheduler-failover-controller

A process that runs in unison with Apache Airflow to control the Scheduler process to ensure High Availability
Apache License 2.0

Can you provide some information on the configuration of the Zookeeper? #28

Closed zhijian-pro closed 3 years ago

zhijian-pro commented 4 years ago

I found the metadata_service_zookeeper_nodes setting in the configuration file, but could not find any documentation for it. Can you explain how this setting is used and how it works? I also have a few questions.

  1. Is the metadata_service_zookeeper_nodes configuration required?
  2. If metadata_service_zookeeper_nodes is set, do I still need to configure scheduler_nodes_in_cluster?
  3. If I use systemd to manage the Airflow scheduler process, will the Restart=always and RestartSec=60 settings affect failover when using scheduler_failover?

Thanks for all the Airflow plugins you have contributed!

rssanders3 commented 3 years ago

Hello @Anthony-Duan, sorry for the delay.

Thank you for your questions. I will work on updating the documentation to include the answer to these questions, but here they are for your purposes:

Is the metadata_service_zookeeper_nodes configuration required? It is only required if metadata_service_type is set to ZookeeperMetadataService. Otherwise you can leave it at its default.

If metadata_service_zookeeper_nodes is set, do I still need to configure scheduler_nodes_in_cluster? Yes, you still need to set scheduler_nodes_in_cluster, as this is the list that the Scheduler Failover Controller uses to determine which hosts are acting as Schedulers. These details are not included in Zookeeper.
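
As a rough illustration of the two answers above, an airflow.cfg fragment using Zookeeper as the metadata service might look like the sketch below. It assumes the settings live in a [scheduler_failover] section, that both settings take comma-separated lists, and that Zookeeper nodes are given as host:port; all host names are hypothetical placeholders.

```ini
[scheduler_failover]
# Hosts that may act as Schedulers. Still needed when Zookeeper is used,
# because this list is not kept in Zookeeper.
# (scheduler-host-1 and scheduler-host-2 are hypothetical names)
scheduler_nodes_in_cluster = scheduler-host-1,scheduler-host-2

# The Zookeeper setting below is only required for this service type.
metadata_service_type = ZookeeperMetadataService

# Zookeeper nodes to connect to (hypothetical hosts, assumed host:port format)
metadata_service_zookeeper_nodes = zk-host-1:2181,zk-host-2:2181
```

With any other metadata_service_type, the last setting can stay at its default.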

If I use systemd to manage the Airflow scheduler process, will the Restart=always and RestartSec=60 settings affect failover when using scheduler_failover? This may affect the Scheduler Failover Controller, as the controller might poll while the Scheduler is restarting. It is recommended to disable the automatic restart of the Scheduler process in the systemd unit file, so remove the Restart and RestartSec lines from the default systemd file.
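
A minimal sketch of that change, assuming a scheduler unit file along the lines of the one shipped with Airflow. The ExecStart path is an assumption and will differ per installation; the only point here is that the two restart lines are commented out (or deleted).

```ini
[Service]
Type=simple
# Assumed install path for the airflow executable; adjust for your system
ExecStart=/bin/airflow scheduler
# Automatic restart disabled so the Scheduler Failover Controller alone
# decides when and where the Scheduler is started
#Restart=always
#RestartSec=60
```

After editing the unit file, systemd needs to reload it (systemctl daemon-reload) before the change takes effect.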