airbnb / synapse

A transparent service discovery framework for connecting an SOA
MIT License

Validate HAProxy config prior to writing config file #305

Closed: panchr closed this PR 4 years ago

panchr commented 4 years ago

This PR adds the following: before writing a newly generated HAProxy config to the live config file, Synapse now validates it (via haproxy -c -f against a staging copy) and keeps the existing config if validation fails. The check can be disabled with the do_checks option.

Because of state-file caching, it is still possible for certain bad state to remain in Synapse (both in memory and in the cache file). However, that state will not propagate to HAProxy; HAProxy keeps the last good config.
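
The promotion logic is roughly the following (a minimal sketch, not the PR's actual code; the paths, method name, and logger are illustrative):

require 'logger'
require 'open3'

LOG = Logger.new($stderr)

STAGING_PATH = '/etc/haproxy/haproxy-staging.cfg'  # staging copy, checked first
LIVE_PATH    = '/etc/haproxy/haproxy.cfg'          # only updated on success

def write_config(new_config, do_checks: true)
  if do_checks
    File.write(STAGING_PATH, new_config)
    # `haproxy -c -f FILE` parses the file and exits non-zero on errors
    output, status = Open3.capture2e('haproxy', '-c', '-f', STAGING_PATH)
    unless status.success?
      LOG.error("synapse: invalid generated HAProxy config: #{output}")
      return false  # leave the last good live config untouched
    end
  end
  File.write(LIVE_PATH, new_config)
  true
end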

Tests

1. Local testing of a bad HAProxy config in ZK

Synapse does not write bad HAProxy config:

I, [2020-01-16T15:58:57.760357 #79492]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: no config_for_generator data from mango-test for service mango-test; keep existing config_for_generator
I, [2020-01-16T15:58:58.237389 #79492]  INFO -- Synapse::Synapse: synapse: configuring haproxy
I, [2020-01-16T15:58:58.238499 #79492]  INFO -- Synapse::ConfigGenerator::Haproxy: @watcher config: {"mango-test"=>{"frontend"=>["mode http"], "backend"=>["mode http", "option httpchk /health", "http-check expect string OK"]}}
I, [2020-01-16T15:58:58.238539 #79492]  INFO -- Synapse::ConfigGenerator::Haproxy: @frontends_cache: {"mango-test"=>["\nfrontend mango-test", ["\tmode http"], "\tbind localhost:3213 ", "\tdefault_backend mango-test"]}
I, [2020-01-16T15:58:58.238569 #79492]  INFO -- Synapse::ConfigGenerator::Haproxy: @backends_cache: {"mango-test"=>["\nbackend mango-test", ["\tmode http", "\toption httpchk /health", "\thttp-check expect string OK"], ["\tserver i-{HOST1} {IP1} cookie i-{HOST1} check inter 2s rise 3 fall 2 id 1", "\tserver i-{HOST2} {IP2} cookie i-{HOST2} check inter 2s rise 3 fall 2 id 1"]]}
I, [2020-01-16T15:58:58.238619 #79492]  INFO -- Synapse::ConfigGenerator::Haproxy: @watcher_revisions: {"mango-test"=>2}
E, [2020-01-16T15:58:58.245069 #79492] ERROR -- Synapse::ConfigGenerator::Haproxy: synapse: invalid generated HAProxy config (checked via haproxy -c -f /usr/local/etc/haproxy/haproxy-staging.cfg): [ALERT] 015/155858 (80168) : parsing [/usr/local/etc/haproxy/haproxy-staging.cfg:29] : 'server i-{HOST}' : 'id' : custom id 1 already used at /usr/local/etc/haproxy/haproxy-staging.cfg:28 ('server i-{HOST}')
[ALERT] 015/155858 (80168) : Error(s) found in configuration file : /usr/local/etc/haproxy/haproxy-staging.cfg
[ALERT] 015/155858 (80168) : Fatal errors found in configuration.

I, [2020-01-16T15:58:58.245358 #79492]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: checked HAProxy config located at /usr/local/etc/haproxy/haproxy-staging.cfg; status: false
I, [2020-01-16T15:58:58.245397 #79492]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: at time 21 waiting until 39 to restart

The running HAProxy config is still valid, while the "staging" config is invalid:

$ haproxy -c -f /usr/local/etc/haproxy/haproxy.cfg
Configuration file is valid

$ haproxy -c -f /usr/local/etc/haproxy/haproxy-staging.cfg
[ALERT] 015/160311 (88063) : parsing [/usr/local/etc/haproxy/haproxy-staging.cfg:29] : 'server i-{HOST}' : 'id' : custom id 1 already used at /usr/local/etc/haproxy/haproxy-staging.cfg:28 ('server i-{HOST}')
[ALERT] 015/160311 (88063) : Error(s) found in configuration file : /usr/local/etc/haproxy/haproxy-staging.cfg
[ALERT] 015/160311 (88063) : Fatal errors found in configuration.
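
The duplicate-id alert above is straightforward to reproduce with a minimal HAProxy config (backend name and addresses are made up):

defaults
    mode http
    timeout connect 5s
    timeout client 5s
    timeout server 5s

backend mango-test
    server host1 10.0.0.1:8080 check id 1
    server host2 10.0.0.2:8080 check id 1   # same explicit id: "custom id 1 already used"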

2. Normal behavior: HAProxy configuration is valid (mango-canary)

I, [2020-01-17T00:23:40.639575 #5450]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: zk exists at /production/secure/services/mango-canary/services for 1 times
I, [2020-01-17T00:23:40.640891 #5450]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: discovering backends for service mango-canary
I, [2020-01-17T00:23:40.640984 #5450]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: zk list children at /production/secure/services/mango-canary/services for 1 times
I, [2020-01-17T00:23:40.644566 #5450]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: discovered 2 backends for service mango-canary
I, [2020-01-17T00:23:40.644614 #5450]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: no config_for_generator data from mango-canary for service mango-canary; keep existing config_for_generator
I, [2020-01-17T00:23:41.462955 #5450]  INFO -- Synapse::Synapse: synapse: configuring haproxy
I, [2020-01-17T00:23:41.467110 #5450]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: restart required because we have a new backend mango-canary/i-{HOST}
I, [2020-01-17T00:23:41.467710 #5450]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: reconfigured haproxy via /var/haproxy/stats1.sock
I, [2020-01-17T00:23:41.530083 #5450]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: checked HAProxy config located at /etc/haproxy/haproxy-staging.cfg; status: true
I, [2020-01-17T00:23:41.634800 #5450]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: restarted haproxy

3. Behavior with an invalid HAProxy config (mango-canary)

I, [2020-01-17T00:31:22.976560 #5450]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: restart required because we have a new backend mango-canary/i-badconfig_randomip:2048
I, [2020-01-17T00:31:22.977113 #5450]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: reconfigured haproxy via /var/haproxy/stats1.sock
I, [2020-01-17T00:31:22.982254 #5450]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: restart required because haproxy_server_options changed for i-{HOST}
E, [2020-01-17T00:31:23.071370 #5450] ERROR -- Synapse::ConfigGenerator::Haproxy: synapse: invalid generated HAProxy config (checked via sudo haproxy -c -f /etc/haproxy/haproxy-staging.cfg): [ALERT] 016/003122 (6935) : parsing [/etc/haproxy/haproxy-staging.cfg:286] : 'server i-badconfig_randomip:2048' : invalid address: 'randomip' in 'randomip:2048'

[ALERT] 016/003122 (6935) : parsing [/etc/haproxy/haproxy-staging.cfg:288] : 'server i-{HOST}' : 'id' : custom id 1 already used at /etc/haproxy/haproxy-staging.cfg:287 ('server i-{HOST}')
[ALERT] 016/003122 (6935) : Error(s) found in configuration file : /etc/haproxy/haproxy-staging.cfg
[ALERT] 016/003123 (6935) : Fatal errors found in configuration.

I, [2020-01-17T00:31:23.071532 #5450]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: checked HAProxy config located at /etc/haproxy/haproxy-staging.cfg; status: false
I, [2020-01-17T00:31:23.177268 #5450]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: restarted haproxy

The running HAProxy config is still valid because it is unchanged:

$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg
Configuration file is valid

$ sudo haproxy -c -f /etc/haproxy/haproxy-staging.cfg
[ALERT] 016/003421 (7404) : parsing [/etc/haproxy/haproxy-staging.cfg:286] : 'server i-badconfig_randomip:2048' : invalid address: 'randomip' in 'randomip:2048'

[ALERT] 016/003421 (7404) : parsing [/etc/haproxy/haproxy-staging.cfg:288] : 'server i-{HOST}' : 'id' : custom id 1 already used at /etc/haproxy/haproxy-staging.cfg:287 ('server i-{HOST}')
[ALERT] 016/003421 (7404) : Error(s) found in configuration file : /etc/haproxy/haproxy-staging.cfg
[ALERT] 016/003421 (7404) : Fatal errors found in configuration.

$ sudo service haproxy status
haproxy is running.

4. do_checks = false behavior with a valid HAProxy config

No check is performed:

I, [2020-01-17T01:05:25.729172 #12131]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: zk list children at /production/secure/services/mango-canary/services for 1 times
I, [2020-01-17T01:05:25.730542 #12131]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: discovered 1 backends for service mango-canary
I, [2020-01-17T01:05:25.730588 #12131]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: no config_for_generator data from mango-canary for service mango-canary; keep existing config_for_generator
I, [2020-01-17T01:05:26.664936 #12131]  INFO -- Synapse::Synapse: synapse: configuring haproxy
I, [2020-01-17T01:05:26.669463 #12131]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: restart required because we added new section mango-canary
I, [2020-01-17T01:05:26.670212 #12131]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: reconfigured haproxy via /var/haproxy/stats1.sock
I, [2020-01-17T01:05:26.788794 #12131]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: restarted haproxy
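
For reference, disabling the check would presumably look like this in the haproxy section of Synapse's config (the do_checks key and its placement are assumptions based on this PR; the other keys are standard Synapse haproxy options):

"haproxy": {
  "config_file_path": "/etc/haproxy/haproxy.cfg",
  "reload_command": "sudo service haproxy reload",
  "do_writes": true,
  "do_reloads": true,
  "do_checks": false
}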

5. do_checks = false behavior with an invalid HAProxy config

No check is performed, and HAProxy then fails to reload:

I, [2020-01-17T01:07:23.061730 #12131]  INFO -- Synapse::ServiceWatcher::ZookeeperWatcher: synapse: no config_for_generator data from mango-canary for service mango-canary; keep existing config_for_generator
I, [2020-01-17T01:07:23.884589 #12131]  INFO -- Synapse::Synapse: synapse: configuring haproxy
I, [2020-01-17T01:07:23.887874 #12131]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: restart required because we have a new backend mango-canary/i-myhost2_1.1.1.2:1026
I, [2020-01-17T01:07:23.888533 #12131]  INFO -- Synapse::ConfigGenerator::Haproxy: synapse: reconfigured haproxy via /var/haproxy/stats1.sock
[ALERT] 016/010723 (12829) : parsing [/etc/haproxy/haproxy.cfg:287] : 'server i-myhost2_1.1.1.2:1026' : 'id' : custom id 1 already used at /etc/haproxy/haproxy.cfg:286 ('server i-myhost_1.1.1.1:1025')
[ALERT] 016/010723 (12829) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT] 016/010723 (12829) : Fatal errors found in configuration.
E, [2020-01-17T01:07:23.959745 #12131] ERROR -- Synapse::ConfigGenerator::Haproxy: failed to reload haproxy via sudo service haproxy reload:  * Reloading haproxy haproxy
   ...fail!

And the "production" config is invalid:

$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg
[ALERT] 016/010751 (12902) : parsing [/etc/haproxy/haproxy.cfg:287] : 'server i-myhost2_1.1.1.2:1026' : 'id' : custom id 1 already used at /etc/haproxy/haproxy.cfg:286 ('server i-myhost_1.1.1.1:1025')
[ALERT] 016/010751 (12902) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT] 016/010751 (12902) : Fatal errors found in configuration.
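
This is exactly the regression the new check prevents. Note that by the time the reload runs here, the bad config has already been written to the live file, which is why the PR validates against a staging copy before writing; guarding only the reload by hand, e.g.

$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg && sudo service haproxy reload

would keep the old process running but still leave a broken file on disk.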

Reviewers

@anson627 @austin-zhu @Jason-Jian cc: @Ramyak

anson627 commented 4 years ago

can we have a more meaningful commit summary?

panchr commented 4 years ago

@anson627

> can we have a more meaningful commit summary?

How is it now?