Closed: rubensyltek closed this issue 2 years ago.
Can you set DEBUG=true in DFP and observe whether it is handling the requests when you get the basic blank 503 message?

These are the last lines of the proxy's log after rebooting the workers:
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | backend monitoring_grafana-be3000_0
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | mode http
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | log global
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | server monitoring_grafana monitoring_grafana:3000
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | backend mypadel-develop_mypadel-be8090_0
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | mode http
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | log global
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | server mypadel-develop_mypadel mypadel-develop_mypadel:8090
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | backend mypadel-develop_user-be8181_0
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | mode http
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | log global
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | server mypadel-develop_user mypadel-develop_user:8181
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | backend playtomic-develop_static-be80_0
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | mode http
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | log global
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | server playtomic-develop_static playtomic-develop_static:80
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | backend playtomic-staging_static-be80_0
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | mode http
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | log global
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | server playtomic-staging_static playtomic-staging_static:80
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 |
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | backend visualizer_visualizer-be8080_0
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | mode http
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | log global
proxy_proxy.1.y0x7vuujnvw2@syswawor2 | server visualizer_visualizer visualizer_visualizer:8080
Not all services are in the HAProxy configuration at this point. There are no more entries in the log, and all requests return 503.
@rubensyltek Which docker version are you using?
Client:
 Version:           18.06.0-ce
 API version:       1.37 (downgraded from 1.38)
 Go version:        go1.10.3
 Git commit:        0ffa825
 Built:              Wed Jul 18 19:05:26 2018
 OS/Arch:           darwin/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.03.0-ce
  API version:      1.37 (minimum version 1.12)
  Go version:       go1.9.4
  Git commit:       0520e24
  Built:             Wed Mar 21 23:08:31 2018
  OS/Arch:          linux/amd64
  Experimental:     false
@rubensyltek Can you spin up a testing environment with servers running 18.06.1-ce and see if the issue still appears?
OK, I could do that, but remember that if I exec into the container and restart the haproxy process without restarting the container, it starts working again.
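For reference, that manual workaround looks roughly like this; the /cfg/haproxy.cfg path appears in the logs below, but the exact reload invocation used inside the DFP container is an assumption:

# open a shell in the running proxy container on this node
docker exec -it $(docker ps -q -f name=proxy_proxy) sh

# inside the container: start a new haproxy and hand over from the old PIDs
haproxy -f /cfg/haproxy.cfg -D -sf $(pidof haproxy)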
Have you been able to reproduce this issue? It's pretty straightforward to reproduce
Same problem with 18.06.1-ce
Full log:
proxy_proxy.1.azw90rjy2551@syswawor2 | 2018/09/12 09:13:08 HAPRoxy: 10.255.0.7:57588 [12/Sep/2018:09:13:07.980] services monitoring_grafana-be3000_0/monitoring_grafana 0/0/0/134/139 200 25069 - - ---- 1/1/0/1/0 0/0 "GET /?orgId=1 HTTP/1.1"
proxy_proxy.1.azw90rjy2551@syswawor2 | 2018/09/12 09:13:08 HAPRoxy: 10.255.0.11:40398 [12/Sep/2018:09:13:08.687] services monitoring_grafana-be3000_0/monitoring_grafana 0/0/0/50/51 200 1636 - - ---- 1/1/0/1/0 0/0 "GET /api/dashboards/home HTTP/1.1"
proxy_proxy.1.azw90rjy2551@syswawor2 | 2018/09/12 09:13:10 HAPRoxy: 10.255.60.195:51540 [12/Sep/2018:09:13:10.081] services monitoring_grafana-be3000_0/monitoring_grafana 0/0/0/1/1 200 3815 - - ---- 1/1/0/1/0 0/0 "GET /public/img/fav32.png HTTP/1.1"
proxy_proxy.1.azw90rjy2551@syswawor2 | 2018/09/12 09:13:10 HAPRoxy: 10.255.0.7:57976 [12/Sep/2018:09:13:10.222] services monitoring_grafana-be3000_0/monitoring_grafana 0/0/0/47/48 200 2280 - - ---- 3/3/2/3/0 0/0 "GET /api/plugins?core=0&embedded=0 HTTP/1.1"
proxy_proxy.1.azw90rjy2551@syswawor2 | 2018/09/12 09:13:10 HAPRoxy: 10.255.0.11:40650 [12/Sep/2018:09:13:10.224] services monitoring_grafana-be3000_0/monitoring_grafana 0/0/0/99/99 200 947 - - ---- 3/3/1/2/0 0/0 "GET /api/search?dashboardIds=27&dashboardIds=47&dashboardIds=23&dashboardIds=24&limit=4 HTTP/1.1"
proxy_proxy.1.azw90rjy2551@syswawor2 | 2018/09/12 09:13:10 HAPRoxy: 10.255.60.195:51568 [12/Sep/2018:09:13:10.252] services monitoring_grafana-be3000_0/monitoring_grafana 0/0/0/77/77 200 939 - - ---- 3/3/0/1/0 0/0 "GET /api/search?limit=4&starred=true HTTP/1.1"
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:14:55 Starting HAProxy
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:14:55 Starting "Docker Flow: Proxy"
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Got configuration from http://swarm-listener:8080.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Creating configuration for the service monitoring_grafana
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Creating configuration for the service mypadel-develop_mypadel
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Creating configuration for the service playtomic-staging_static
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Creating configuration for the service mypadel-develop_user
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Creating configuration for the service visualizer_visualizer
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Creating configuration for the service playtomic-develop_static
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Reloading the proxy
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091500 (33) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:00 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:01 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091501 (38) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091501 (38) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:01 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:02 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091502 (43) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091502 (43) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091502 (43) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:02 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:03 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091503 (48) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091503 (48) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091503 (48) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:03 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:04 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091504 (53) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091504 (53) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091504 (53) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:05 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:06 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091506 (58) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091506 (58) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091506 (58) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:06 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:07 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091507 (63) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:07 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091507 (63) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091507 (63) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:08 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091508 (68) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091508 (68) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091508 (68) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:08 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:09 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091509 (73) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091509 (73) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091509 (73) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:09 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:10 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091510 (78) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091510 (78) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091510 (78) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:10 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:11 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091511 (83) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091511 (83) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:11 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:12 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091512 (88) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091512 (88) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:12 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:13 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091513 (93) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091513 (93) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:13 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:14 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091514 (98) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091514 (98) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:14 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:15 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091515 (103) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091515 (103) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091515 (103) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:15 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:16 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091516 (122) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091516 (122) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091516 (122) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:16 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:17 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091517 (131) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091517 (131) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091517 (131) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:17 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:18 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091518 (140) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091518 (140) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091518 (140) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:18 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:19 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091519 (149) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091519 (149) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091519 (149) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:19 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:20 Validating configuration
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091520 (158) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091520 (158) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091520 (158) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | Exit Status: 1
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:20 Config validation failed. Will try again...
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | 2018/09/12 09:15:21 Config validation failed
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | stdout:
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | stderr:
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091520 (158) : parsing [/cfg/haproxy.cfg:93] : 'server mypadel-develop_mypadel' : could not resolve address 'mypadel-develop_mypadel'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091520 (158) : parsing [/cfg/haproxy.cfg:100] : 'server mypadel-develop_user' : could not resolve address 'mypadel-develop_user'.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | [ALERT] 254/091520 (158) : Failed to initialize server(s) addr.
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | global
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | pidfile /var/run/haproxy.pid
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | stats socket /var/run/haproxy.sock mode 660 level admin expose-fd listeners
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | tune.ssl.default-dh-param 2048
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | log 127.0.0.1:1514 local0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | # disable sslv3, prefer modern ciphers
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | ssl-default-bind-ciphers ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:EECDH+AESGCM:EDH+AESGCM
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | ssl-default-server-options ssl-min-ver TLSv1.2 no-tls-tickets
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | ssl-default-server-ciphers ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:EECDH+AESGCM:EDH+AESGCM
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | resolvers docker
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | nameserver dns 127.0.0.11:53
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | defaults
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | mode http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | balance roundrobin
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | option http-keep-alive
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | option redispatch
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | errorfile 400 /errorfiles/400.http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | errorfile 403 /errorfiles/403.http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | errorfile 405 /errorfiles/405.http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | errorfile 408 /errorfiles/408.http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | errorfile 429 /errorfiles/429.http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | errorfile 500 /errorfiles/500.http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | errorfile 502 /errorfiles/502.http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | errorfile 503 /errorfiles/503.http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | errorfile 504 /errorfiles/504.http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | maxconn 5000
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | timeout connect 5s
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | timeout client 20s
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | timeout server 20s
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | timeout queue 30s
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | timeout tunnel 3600s
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | timeout http-request 5s
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | timeout http-keep-alive 15s
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | frontend services
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | bind *:80
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | bind *:443
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | mode http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | option forwardfor
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | option httplog
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | log global
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | acl url_mypadel-develop_mypadel8090_0 path_beg /health path_beg /v2/payments path_beg /v2/payment_methods path_beg /v2/availability path_beg /v2/suggestion path_beg /v2/user/accounts path_beg /v2/user/profile
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | use_backend mypadel-develop_mypadel-be8090_0 if url_mypadel-develop_mypadel8090_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | acl url_mypadel-develop_user8181_0 path_beg /v2/auth path_beg /v2/user/resend_validation_email path_beg /v2/users
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | use_backend mypadel-develop_user-be8181_0 if url_mypadel-develop_user8181_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | acl url_visualizer_visualizer8080_0 path_beg /visualizer
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | use_backend visualizer_visualizer-be8080_0 if url_visualizer_visualizer8080_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | acl url_monitoring_grafana3000_0 path_beg /
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | acl domain_monitoring_grafana3000_0 hdr_beg(host) -i grafana.syltek.com
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | use_backend monitoring_grafana-be3000_0 if url_monitoring_grafana3000_0 domain_monitoring_grafana3000_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | acl url_playtomic-develop_static80_0 path_beg /
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | acl domain_playtomic-develop_static80_0 hdr_end(host) -i playtomic-develop.syltek.com
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | use_backend playtomic-develop_static-be80_0 if url_playtomic-develop_static80_0 domain_playtomic-develop_static80_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | acl url_playtomic-staging_static80_0 path_beg /
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | acl domain_playtomic-staging_static80_0 hdr_end(host) -i playtomic-staging.syltek.com
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | use_backend playtomic-staging_static-be80_0 if url_playtomic-staging_static80_0 domain_playtomic-staging_static80_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | backend monitoring_grafana-be3000_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | mode http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | log global
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | server monitoring_grafana monitoring_grafana:3000
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | backend mypadel-develop_mypadel-be8090_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | mode http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | log global
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | server mypadel-develop_mypadel mypadel-develop_mypadel:8090
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | backend mypadel-develop_user-be8181_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | mode http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | log global
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | server mypadel-develop_user mypadel-develop_user:8181
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | backend playtomic-develop_static-be80_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | mode http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | log global
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | server playtomic-develop_static playtomic-develop_static:80
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | backend playtomic-staging_static-be80_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | mode http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | log global
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | server playtomic-staging_static playtomic-staging_static:80
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 |
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | backend visualizer_visualizer-be8080_0
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | mode http
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | http-request add-header X-Forwarded-Proto https if { ssl_fc }
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | log global
proxy_proxy.1.dhyy30v0gbmo@syswawor2 | server visualizer_visualizer visualizer_visualizer:8080
Are the mypadel-develop_mypadel and mypadel-develop_user services running? If they are running, that error suggests that DFP is not able to detect them over the overlay network.
Can you post a simple example using docker-machine to reproduce this issue? This would help iron out the configuration details needed to recreate your issue. Specifically, I want to see what constraints were placed on the services.
Yes, all services are running. I don't understand what you mean by "a simple example using docker-machine". We don't use docker-machine at all in our environment.
If it's a problem with the overlay network, why does everything start working again when I exec into the DFP container and restart the haproxy instance?
I'll do a test later today to ping the services from the DFP container and see if they are reachable.
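A quick reachability check from inside the DFP container could look something like this; the service names, the 127.0.0.11 resolver, and the /health path come from the HAProxy config and ACLs shown above, while the exact tools available in the image are an assumption:

# open a shell in the proxy container
docker exec -it $(docker ps -q -f name=proxy_proxy) sh

# does Docker's embedded DNS resolve the service names?
nslookup mypadel-develop_mypadel 127.0.0.11
nslookup mypadel-develop_user 127.0.0.11

# can the proxy actually reach a backend port?
wget -qO- http://mypadel-develop_mypadel:8090/health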
Just to clarify: mypadel-develop_mypadel and mypadel-develop_user are services that finish starting after DFP is already up and running. That's why you see that in the log.
I don't understand what you mean by "a simple example using docker-machine". We don't use docker-machine at all in our environment.
Since everyone's environment is different, using docker-machine and VirtualBox helps with reproducing the issue. This issue has to do with how servers going down are handled. One way to simulate this on a development machine is to use multiple VMs, as sketched below.
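For example, a throwaway swarm for this could be built roughly as follows; the node names and counts are arbitrary:

# create three VirtualBox VMs
docker-machine create -d virtualbox node-1
docker-machine create -d virtualbox node-2
docker-machine create -d virtualbox node-3

# initialize the swarm on node-1 and join the others as workers
eval $(docker-machine env node-1)
docker swarm init --advertise-addr $(docker-machine ip node-1)
JOIN_TOKEN=$(docker swarm join-token -q worker)
for n in node-2 node-3; do
  docker-machine ssh $n "docker swarm join --token $JOIN_TOKEN $(docker-machine ip node-1):2377"
done

# later, simulate the worker reboots that trigger the issue
docker-machine restart node-2 node-3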
I ask for a concrete example because this issue has many states and configuration options. It would also answer a bunch of questions:

Are your services constrained to running on worker nodes?
If they are constrained, is DFP handling this situation correctly?
After the nodes go down, is DFP still working for services that are still running?
After the nodes go up, is DFP still working for services that are still running?
Does this issue only happen with 10-15 services, or is 1-2 services enough?
From the logs there are other services, such as visualizer_visualizer, which are okay. What about your configuration makes them reachable?

Here is what I think the issue is:
Case 1
Running docker service update --force on DFP asks it to request all services again. This time it works, since the services are up. If I filter out the down services from step 3, I need to make sure DFSL will send notifications about the service when it comes back up.
Case 2
Running docker service update --force on DFP asks it to request all services again. This time it works, since the services are up. This is the tougher case: currently DFSL is also able to listen to node events, so when a node fails, DFSL can check whether all the services are okay and update DFP. This would need to happen all at once, since the state change is not gradual.
Let me answer some of your questions:
Are your services constrained to running on worker nodes? Yes.
If they are constrained, is DFP handling this situation correctly? I don't know what you mean.
After the nodes go down, is DFP still working for services that are still running? In this particular scenario all worker nodes are rebooted at the same time, so no services are up at that point.
After the nodes go up, is DFP still working for services that are still running? No, DFP always returns 503.
Does this issue only happen with 10-15 services or is 1-2 services enough? We've been able to reproduce it with 10-plus services. We've also noticed that it is much easier to reproduce in the dev environment than in production, because dev is much slower at restarting all services at once. Maybe it's a race condition that happens more frequently in slow environments?
From the logs there are other services such as visualizer_visualizer which are okay. What about your configuration makes them reachable? I don't really know what you mean.
I hope that helps
Another solution would be to add option httpchk to DFP's HAProxy config, and a healthcheck to your services, so HAProxy can determine when a service is offline; see the sketch below. If you are using TCP, a TCP check would also work.
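As a rough sketch (not what DFP generates today), one of the backends from the config above would look something like this with checks enabled; the /health path is taken from the ACLs in the logged config and is only an example:

backend mypadel-develop_mypadel-be8090_0
    mode http
    http-request add-header X-Forwarded-Proto https if { ssl_fc }
    log global
    # mark the server as down when the health endpoint stops answering
    option httpchk GET /health
    server mypadel-develop_mypadel mypadel-develop_mypadel:8090 check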
@rubensyltek Do you think a healthcheck in DFP would help with your use case?
Hi,
We've decided not to keep using Docker Flow Proxy as part of our architecture. I think it's a great idea, but almost every problem we've had in production so far has either been caused by DFP, or DFP has made the situation even more difficult to analyze and fix.
Our architecture is built to minimize single-point-of-failure components, and DFP is not robust enough for what we need right now.
Regarding this issue, as I've said, it's pretty simple to reproduce. If you create some services whose start-up time is high, you will be able to reproduce this problem fairly easily.
Thanks a lot!
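As a hypothetical illustration of the reproduction described above, a slow-starting service announced to DFP could look like the stack below; the com.df.* labels follow the usual DFP convention and the sleep merely simulates a long start-up:

version: "3"
services:
  slow-start:
    image: nginx:alpine
    # delay the web server so the service takes a long time to become reachable
    command: ["sh", "-c", "sleep 120 && nginx -g 'daemon off;'"]
    networks:
      - proxy
    deploy:
      labels:
        - com.df.notify=true
        - com.df.distribute=true
        - com.df.servicePath=/slow
        - com.df.port=80
networks:
  proxy:
    external: true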
I am having the same issue as well. DFP works sometimes; other times it keeps returning 503s and timeouts, and after a restart of the nodes it comes back again. It is really painful that DFP is not stable yet. If there are any pointers or a workaround to solve this, that would really help.
Docker Engine used: 17.03-ce. Compose file used below:
version: "3"
services:
proxy:
image: dockerflow/docker-flow-proxy:${TAG:-latest}
ports:
- 8080:80
- 443:443
networks:
- proxy
environment:
- LISTENER_ADDRESS=swarm-listener
- MODE=swarm
- CONNECTION_MODE=${CONNECTION_MODE:-http-keep-alive}
deploy:
replicas: 3
swarm-listener:
image: vfarcic/docker-flow-swarm-listener
networks:
- proxy
volumes:
- /var/run/docker.sock:/var/run/docker.sock
environment:
- DF_NOTIFY_CREATE_SERVICE_URL=http://proxy:8080/v1/docker-flow-proxy/reconfigure
- DF_NOTIFY_REMOVE_SERVICE_URL=http://proxy:8080/v1/docker-flow-proxy/remove
deploy:
placement:
constraints: [node.role == manager]
networks:
proxy:
external: true
Can you test out using dockerflow/docker-flow-swarm-listener:18.10.12-7? There have been several updates that help stabilize DFP.
So that's the new proxy compose file I have used and tested:
version: "3"
services:
proxy:
image: dockerflow/docker-flow-proxy:18.10.09-13
ports:
- 8080:80
- 443:443
networks:
- proxy
environment:
- LISTENER_ADDRESS=swarm-listener
- MODE=swarm
- CONNECTION_MODE=${CONNECTION_MODE:-http-keep-alive}
deploy:
replicas: 3
swarm-listener:
image: dockerflow/docker-flow-swarm-listener:18.10.12-7
networks:
- proxy
volumes:
- /var/run/docker.sock:/var/run/docker.sock
environment:
- DF_NOTIFY_CREATE_SERVICE_URL=http://proxy:8080/v1/docker-flow-proxy/reconfigure
- DF_NOTIFY_REMOVE_SERVICE_URL=http://proxy:8080/v1/docker-flow-proxy/remove
deploy:
placement:
constraints: [node.role == manager]
networks:
proxy:
external: true
It behaves quite intermittently. Some requests get a 200 and some get a 504:
<html>
<head>
<!-- Bootstrap -->
<link rel="stylesheet" type="text/css" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css">
<style>
body {
padding-top: 50px
}
</style>
</head>
<body>
<div class="container">
<div class="panel panel-warning">
<div class="panel-heading">
<h3 class="panel-title">Docker Flow Proxy: 504 Gateway Timeout</h3>
</div>
<div class="panel-body">
No server is available to handle this request.
</div>
</div>
</body>
</html>
Just want to chime in here to note we've seen the same issue in our swarms even though we are running the latest versions of DFP and DFSL, 18.10.19-14 and 18.10.12-7 respectively.
After performing an upgrade to 18.03 yesterday, DFP failed to come up cleanly. It seems this was due to another service in the swarm failing to start, so DFP couldn't find it.
DFP was repeating errors like this until I removed the offending service:
[ALERT] 296/193521 (18366) : parsing [/cfg/haproxy.cfg:168] : 'server home_service' : could not resolve address 'home_service'.
[ALERT] 296/193521 (18366) : Failed to initialize server(s) addr.
Exit Status: 1
That service failed to come up due to an unrelated issue with its docker image being unavailable, so it was sitting there with 0/1 replicas.
This is a real concern for us, as it turns DFP into a single point of failure in our upgrade path, unless we can absolutely guarantee that every service in our cluster will come up successfully (maybe before DFP?).
Ideally a failure to find one backend in DFP would not cause failures for the other services.
@maevyn11
All the updates made to DFSL were attempts to never let DFP go into a bad state. The newest DFSL release checks if the service is running before sending it to DFP. It looks like there are still ways for DFSL to notify DFP of a bad state.
To help debug this issue, I need you guys to look at the DFSL logs and see when it sends the notification to DFP about a non-running service. It would help to know what triggered this notification.
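For reference, assuming the stack is deployed as proxy (as the log prefixes in this thread suggest), the DFSL logs can be pulled with something like:

docker service logs --since 24h proxy_swarm-listener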
@thomasjpfan That makes sense. I will see if I can reproduce the issue later on today and collect the DFSL logs.
@thomasjpfan
Managed to reproduce this just now, although not exactly in the way I intended. What I did was scaled the proxy down to one replica, and then terminated two workers in the swarm. When the proxy came back up, it was stuck trying to connect to the services that had also been moved when the workers were terminated.
Same type of log pattern as I mentioned earlier was seen, with the entries including each service that had not yet gotten into a 'running' state.
Here's what the proxy logs looked like upon starting up.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:56:55 Starting HAProxy
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:56:56 Starting "Docker Flow: Proxy"
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Got configuration from http://swarm-listener:8080.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service rethink_ui
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service keycloak_keycloak
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service notary_server
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service monitor_monitor
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service jupyter_jupyter
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service nexus_haproxy
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service logging_kibana
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service swarmviz_dashboard
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service grafana_grafana
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Removing job-service_mongo-express configuration
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 job-service_mongo-express was not configured, no reload required
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Removing job-service_job-site configuration
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 job-service_job-site was not configured, no reload required
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service gridplus-energy-wallet-api_gridplus-energy-wallet-api
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service denormalizer-api_prisma
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service nexus_nexus
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service openkb_openkb
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service home_service
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Creating configuration for the service servicebus_rmq1
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/185701 (29) : parsing [/cfg/haproxy.cfg:60] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/185701 (29) : parsing [/cfg/haproxy.cfg:77] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/185701 (29) : parsing [/cfg/haproxy.cfg:85] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Reloading the proxy
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:01 Validating configuration
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185701 (29) : parsing [/cfg/haproxy.cfg:138] : 'server denormalizer-api_prisma' : could not resolve address 'denormalizer-api_prisma'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185701 (29) : parsing [/cfg/haproxy.cfg:156] : 'server home_service' : could not resolve address 'home_service'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185701 (29) : parsing [/cfg/haproxy.cfg:162] : 'server jupyter_jupyter' : could not resolve address 'jupyter_jupyter'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185701 (29) : parsing [/cfg/haproxy.cfg:168] : 'server keycloak_keycloak' : could not resolve address 'keycloak_keycloak'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185701 (29) : parsing [/cfg/haproxy.cfg:199] : 'server notary_server' : could not resolve address 'notary_server'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | Exit Status: 1
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:02 Config validation failed. Will try again...
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/185703 (38) : parsing [/cfg/haproxy.cfg:60] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/185703 (38) : parsing [/cfg/haproxy.cfg:77] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/185703 (38) : parsing [/cfg/haproxy.cfg:85] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:03 Validating configuration
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185703 (38) : parsing [/cfg/haproxy.cfg:138] : 'server denormalizer-api_prisma' : could not resolve address 'denormalizer-api_prisma'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185703 (38) : parsing [/cfg/haproxy.cfg:156] : 'server home_service' : could not resolve address 'home_service'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185703 (38) : parsing [/cfg/haproxy.cfg:162] : 'server jupyter_jupyter' : could not resolve address 'jupyter_jupyter'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185703 (38) : parsing [/cfg/haproxy.cfg:168] : 'server keycloak_keycloak' : could not resolve address 'keycloak_keycloak'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/185703 (38) : parsing [/cfg/haproxy.cfg:199] : 'server notary_server' : could not resolve address 'notary_server'.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | Exit Status: 1
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 18:57:05 Config validation failed. Will try again...
And here's an example from the service which took the longest to move over, and therefore was the last thing keeping the proxy from starting cleanly.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/190205 (820) : Failed to initialize server(s) addr.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | Exit Status: 1
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 19:02:05 Config validation failed. Will try again...
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | 2018/10/26 19:02:06 Validating configuration
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/190206 (821) : parsing [/cfg/haproxy.cfg:60] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/190206 (821) : parsing [/cfg/haproxy.cfg:77] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/190206 (821) : parsing [/cfg/haproxy.cfg:85] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [WARNING] 298/190206 (821) : parsing [/cfg/haproxy.cfg:102] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.rh84ru09aryp@ip-172-31-34-8.ec2.internal | [ALERT] 298/190206 (821) : parsing [/cfg/haproxy.cfg:169] : 'server jupyter_jupyter' : could not resolve address 'jupyter_jupyter'.
As requested, here are the DFSL logs from the incident. The first log entry is from when I scaled the proxy down to 1 replica. The following entries begin once I terminated the two workers. This is the entirety of the DFSL log output:
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:52:16 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=proxy_proxy
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:01 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:01 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:01 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (1 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:01 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (1 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:06 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (2 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:06 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (2 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:11 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (3 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:11 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (3 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:16 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (4 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:16 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (4 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:21 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (5 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:21 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (5 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:26 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (6 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:26 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (6 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:31 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (7 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:31 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (7 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:57:41 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=proxy_proxy
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:58:48 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (8 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 18:59:14 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (8 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 19:00:23 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (9 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 19:00:45 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (9 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 19:01:50 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (10 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 19:02:11 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service (10 try)
proxy_swarm-listener.1.muu54flq5s41@ip-172-31-28-30.ec2.internal | 2018/10/26 19:03:14 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (11 try)
At first glance, it looks to me like DFSL is not actually sending the requests to the proxy to configure these backends, as I don't see the problematic services in the DFSL log output from the time the worker nodes were terminated.
So is DFP picking up its haproxy configuration from somewhere else? Curious to hear your thoughts; hopefully this helps shed some light on the issue.
When DFP first starts up, it queries DFSL for all running services. From your DFP logs, it seems like DFSL is still considering jupyter_jupyter as "running" and thus configures DFP incorrectly.
There are two updates that need to be made to DFSL:
1. Update what DFSL considers "running". Right now, it uses the docker tasks API and checks if all the tasks are running. This does not seem to be enough. One would need to research the task API and how it changes when a node goes down. (The task API is pretty undocumented, so one needs to play around with it to see how it operates.)
2. When a node goes down, DFSL needs to send DFP an update about ALL services. DFSL is already able to listen to node events. One would need to research which node event to look at to figure out when a node goes down.
Code-wise, both changes are not too difficult to make. The time-intensive part is the research into the docker API when nodes go down: basically watching docker events while bringing nodes down in various ways, and seeing how the docker API reports the services when nodes go down.
The task logic needs to consider more carefully the number of tasks that need to be running. https://github.com/docker-flow/docker-flow-swarm-listener/blob/b1af7e5a1ac3e9aba39e2eead25703e43f2a0f1a/service/task.go#L184
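For illustration, a rough sketch of that kind of check with the Docker Go client could look like the following. This is only a sketch under my own assumptions, not the code in task.go; the service name is just the slow-starting service from this thread. It lists the tasks whose desired state is running and counts how many actually report a running state, which could then be compared against the service's replica count.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/filters"
	"github.com/docker/docker/api/types/swarm"
	"github.com/docker/docker/client"
)

// runningTasks counts how many of a service's tasks currently report the
// "running" state. Right after a node dies, tasks scheduled on it may still
// report "running" for a while, which is exactly the gap described above.
func runningTasks(ctx context.Context, cli *client.Client, service string) (int, error) {
	f := filters.NewArgs(
		filters.Arg("service", service),
		filters.Arg("desired-state", "running"),
	)
	tasks, err := cli.TaskList(ctx, types.TaskListOptions{Filters: f})
	if err != nil {
		return 0, err
	}
	n := 0
	for _, t := range tasks {
		if t.Status.State == swarm.TaskStateRunning {
			n++
		}
	}
	return n, nil
}

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}
	n, err := runningTasks(context.Background(), cli, "jupyter_jupyter")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("running tasks:", n)
}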
The node events that occur when a node is created, goes down, and comes back up are as follows:
(name=swarmnode2, state.new=ready, state.old=unknown)
(name=swarmnode2, state.new=down, state.old=ready)
(name=swarmnode2, state.new=ready, state.old=down)
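For reference, subscribing to those node state changes with the Docker Go client looks roughly like this. Again, a minimal sketch under my own assumptions rather than the actual DFSL code; the state.new/state.old attributes are the same ones shown in the event list above.

package main

import (
	"context"
	"log"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/filters"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}
	// Only listen to node events.
	f := filters.NewArgs(filters.Arg("type", "node"))
	msgs, errs := cli.Events(context.Background(), types.EventsOptions{Filters: f})
	for {
		select {
		case m := <-msgs:
			name := m.Actor.Attributes["name"]
			if m.Actor.Attributes["state.new"] == "down" {
				// This is where DFSL would re-notify DFP about ALL services.
				log.Printf("node %s went down", name)
			}
		case err := <-errs:
			log.Fatal(err)
		}
	}
}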
Hmm... tricky. That makes me think: we have had an issue in our Prod Swarm where docker events do not propagate around the swarm properly, so our DFSL relies on polling in that environment. Of course, that introduces a further delay before DFSL can accurately update the state of the swarm.
Maybe there is an alternate solution, where DFP detects that one of its configured backends is unreachable and purges that entry from its config until the service is reachable again?
Since DFP does not have access to the Docker API, it would not be able to know when a service is fully ready. For example, if a service is at replicas == 1/2, the service is not fully up yet and DFP would not be able to tell the difference.
The traditional way to handle this is to configure haproxy to health-check the services to see whether they are up. This would be a totally new feature.
Anyways, I think I have a handle on what needs to change in DFSL. I should be able to get to it sometime this weekend.
Item 2 is going to take some time to implement. When a node goes down, DFSL needs to check which services went down and send REMOVE notifications to DFP to remove them from haproxy. Next, DFSL needs to wait for the services to come back up and send CREATE notifications to DFP.
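Conceptually, those notifications are plain HTTP GETs against DFP's reconfigure and remove endpoints. Here is a minimal sketch of what item 2 boils down to, not DFSL's actual code; the service name, port and domain are illustrative values taken from the logs in this thread.

package main

import (
	"fmt"
	"log"
	"net/http"
	"net/url"
)

// notify sends a single GET request to a DFP endpoint with the given parameters.
func notify(endpoint string, params url.Values) error {
	resp, err := http.Get(endpoint + "?" + params.Encode())
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %d from %s", resp.StatusCode, endpoint)
	}
	return nil
}

func main() {
	base := "http://proxy:8080/v1/docker-flow-proxy"

	// REMOVE: a service's node just went down, so drop its backend.
	if err := notify(base+"/remove", url.Values{"serviceName": {"jupyter_jupyter"}}); err != nil {
		log.Println(err)
	}

	// CREATE: the service is running again, so reconfigure its backend.
	if err := notify(base+"/reconfigure", url.Values{
		"serviceName":   {"jupyter_jupyter"},
		"port":          {"8888"},
		"serviceDomain": {"jupyter.staging-gridpl.us"},
	}); err != nil {
		log.Println(err)
	}
}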
Makes sense, happy to test as needed.
Both items are implemented. I have tested DFSL out with Vagrant by stopping and starting virtual machines running services that are slow to come up. Can you test out your use case with dockerflow/docker-flow-swarm-listener:18.10.27-9?
Just tested with 18.10.27-9 and unfortunately I'm still seeing the same issue.
After terminating the node containing the instance of the proxy, the proxy did not come up successfully until all of the other services on the terminated node came up as well. In this case our longest startup time service, jupyter, happened to be on the same node as the proxy, so just terminating that one node sufficed to reproduce the issue.
Here are the logs from DFSL after terminating the node.
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&port=4466&replicas=1&serviceDomain=denormalizer-api.staging-gridpl.us&serviceName=denormalizer-api_prisma
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=1m&alertIf=%40service_mem_limit_nobuff%3A0.80&alertName=memlimit&distribute=true&replicas=1&serviceName=sftp_service
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=1m&alertIf=%40service_mem_limit_nobuff%3A0.80&alertName=memlimit&distribute=true&replicas=1&serviceName=sftp_service (1 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8888&replicas=1&serviceDomain=jupyter.staging-gridpl.us&serviceName=jupyter_jupyter
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&port=4466&replicas=1&serviceDomain=denormalizer-api.staging-gridpl.us&serviceName=denormalizer-api_prisma (1 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=notary_signer
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=9000&replicas=2&serviceDomain=docker.staging-gridpl.us&serviceName=nexus_haproxy&timeoutServer=1800
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (1 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8888&replicas=1&serviceDomain=jupyter.staging-gridpl.us&serviceName=jupyter_jupyter (1 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=notary_signer (1 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:49 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=9000&replicas=2&serviceDomain=docker.staging-gridpl.us&serviceName=nexus_haproxy&timeoutServer=1800 (1 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:01:57 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=proxy_proxy
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:03:12 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&port=4466&replicas=1&serviceDomain=denormalizer-api.staging-gridpl.us&serviceName=denormalizer-api_prisma (2 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:03:13 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (2 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:03:14 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8888&replicas=1&serviceDomain=jupyter.staging-gridpl.us&serviceName=jupyter_jupyter (2 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:03:46 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=rethink_write
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:04:20 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.875&alertName=memlimit&distribute=true&port=5601&replicas=1&serviceDomain=logs.staging-gridpl.us&serviceName=logging_kibana
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:04:22 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&port=4466&replicas=1&serviceDomain=denormalizer-api.staging-gridpl.us&serviceName=denormalizer-api_prisma (3 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:04:23 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service (3 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:04:24 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8888&replicas=1&serviceDomain=jupyter.staging-gridpl.us&serviceName=jupyter_jupyter (3 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:04:41 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.85&alertName=memlimit&distribute=true&port=80&replicas=1&serviceName=logging_elasticsearch
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:04:51 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8888&replicas=1&serviceDomain=jupyter.staging-gridpl.us&serviceName=jupyter_jupyter (4 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:18 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8888&replicas=1&serviceDomain=jupyter.staging-gridpl.us&serviceName=jupyter_jupyter (5 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:45 Retrying service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8888&replicas=1&serviceDomain=jupyter.staging-gridpl.us&serviceName=jupyter_jupyter (6 try)
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=9000&replicas=2&serviceDomain=docker.staging-gridpl.us&serviceName=nexus_haproxy&timeoutServer=1800
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&port=3000&redirectWhenHttpProto=true&replicas=1&serviceDomain=job-service.staging-gridpl.us&serviceName=job-service_job-site
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 ERROR: service update paused: update paused due to failure or early termination of task i7pbn908w7uz0ay85pi1uu1es
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=4443&replicas=1&serviceDomain=notary.staging-gridpl.us&serviceName=notary_server
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&redirectWhenHttpProto=true&replicas=1&serviceDomain=gridplus-energy-wallet-api.staging-gridpl.us&serviceName=gridplus-energy-wallet-api_gridplus-energy-wallet-api
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=website.staging-gridpl.us&serviceName=website_service
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&replicas=1&serviceName=job-service_mongo
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&letsencrypt.email=&letsencrypt.host=rabbitmq.staging-gridpl.us&port=15672&replicas=1&serviceDomain=rabbitmq.staging-gridpl.us&serviceName=servicebus_rmq1
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&port=4466&replicas=1&serviceDomain=denormalizer-api.staging-gridpl.us&serviceName=denormalizer-api_prisma
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&port=8081&redirectWhenHttpProto=true&replicas=1&serviceDomain=job-mongo.staging-gridpl.us&serviceName=job-service_mongo-express
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8888&replicas=1&serviceDomain=jupyter.staging-gridpl.us&serviceName=jupyter_jupyter
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=notary_signer
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=8080&redirectWhenHttpProto=true&replicas=1&serviceDomain=signing.staging-gridpl.us&serviceName=signing-api-proxy_service
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:47 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=1m&alertIf=%40service_mem_limit_nobuff%3A0.80&alertName=memlimit&distribute=true&replicas=1&serviceName=sftp_service
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:50 Canceling service create notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8888&replicas=1&serviceDomain=jupyter.staging-gridpl.us&serviceName=jupyter_jupyter
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=1m&alertIf=%40service_mem_limit_nobuff%3A0.80&alertName=memlimit&distribute=true&replicas=1&serviceName=sftp_intelometry-file-watcher
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=1m&alertIf=%40service_mem_limit_nobuff%3A0.90&alertName=memlimit&distribute=true&replicas=1&serviceName=postgres_postgres
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=servicebus_redis
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=proxy_proxy
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.85&alertName=memlimit&distribute=true&port=80&replicas=1&serviceName=logging_elasticsearch
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.80&alertName=memlimit&distribute=true&replicas=1&serviceName=release-catalog-api_redis
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&port=8085&replicas=1&serviceDomain=swarmviz.staging-gridpl.us&serviceName=swarmviz_dashboard
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?distribute=true&replicas=1&scrapePort=9101&serviceName=exporters_ha-proxy
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=monitor_alertmanager
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=rethink_write
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit_nocache&distribute=true&port=8081&redirectWhenHttpProto=true&replicas=1&serviceDomain=nexus.staging-gridpl.us&serviceName=nexus_nexus
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.875&alertName=memlimit&distribute=true&port=5601&replicas=1&serviceDomain=logs.staging-gridpl.us&serviceName=logging_kibana
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=1m&alertIf=%40service_mem_limit_nobuff%3A0.80&alertName=memlimit&distribute=true&replicas=1&serviceName=sql-server_service
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:52 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=1m&alertIf=%40service_mem_limit_nobuff%3A0.80&alertName=memlimit&distribute=true&replicas=1&serviceName=sftp_five9-file-watcher
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=grafana.staging-gridpl.us&serviceName=grafana_grafana
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.85&alertName=memlimit&distribute=true&port=4444&replicas=1&serviceDomain=openkb.staging-gridpl.us&serviceName=openkb_openkb
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit%3A0.8&alertName=memlimit&distribute=true&port=3000&replicas=1&serviceDomain=home.staging-gridpl.us&serviceName=home_service
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=notary_signer_db
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=1m&alertIf=%40service_mem_limit_nobuff%3A0.80&alertName=memlimit&distribute=true&replicas=1&serviceName=sftp_matt-file-watcher
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=8080&replicas=1&serviceDomain=rethink.staging-gridpl.us&serviceName=rethink_ui&usersPassEncrypted=true&usersSecret=rethink
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&port=80&replicas=1&serviceName=logging_logstash
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.9&alertName=memlimit&distribute=true&port=9090&redirectWhenHttpProto=true&replicas=1&scrapePort=9090&serviceDomain=monitor.staging-gridpl.us&serviceName=monitor_monitor
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=1m&alertIf=%40service_mem_limit_nobuff%3A0.80&alertName=memlimit&distribute=true&replicas=1&serviceName=sftp_smartgridcis-file-watcher
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=rethink_primary
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:05:53 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=true&replicas=1&serviceName=notary_server_db
proxy_swarm-listener.1.ijosfi3rvbg7@ip-172-31-28-30.ec2.internal | 2018/10/29 19:06:26 Sending service created notification to http://proxy:8080/v1/docker-flow-proxy/reconfigure?distribute=true&serviceName=logging_logspout
And here are the proxy logs from after the termination:
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:54 Sending distribution request to http://10.0.2.42:8080/v1/docker-flow-proxy/reconfigure?alertFor=30s&alertIf=%40service_mem_limit_nobuff%3A0.8&alertName=memlimit&distribute=false&port=9000&replicas=2&serviceDomain=docker.staging-gridpl.us&serviceName=nexus_haproxy&timeoutServer=1800
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:54 Reloading the proxy
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:54 Validating configuration
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190154 (33) : parsing [/cfg/haproxy.cfg:60] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190154 (33) : parsing [/cfg/haproxy.cfg:68] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190154 (33) : parsing [/cfg/haproxy.cfg:73] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190154 (33) : parsing [/cfg/haproxy.cfg:78] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190154 (33) : parsing [/cfg/haproxy.cfg:83] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:54 Creating configuration for the service nexus_haproxy
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190154 (33) : parsing [/cfg/haproxy.cfg:137] : 'server denormalizer-api_prisma' : could not resolve address 'denormalizer-api_prisma'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | Exit Status: 1
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:54 Config validation failed. Will try again...
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190154 (33) : Failed to initialize server(s) addr.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190155 (35) : parsing [/cfg/haproxy.cfg:60] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190155 (35) : parsing [/cfg/haproxy.cfg:68] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190155 (35) : parsing [/cfg/haproxy.cfg:73] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190155 (35) : parsing [/cfg/haproxy.cfg:81] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190155 (35) : parsing [/cfg/haproxy.cfg:89] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190155 (35) : parsing [/cfg/haproxy.cfg:106] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:55 Validating configuration
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190155 (35) : parsing [/cfg/haproxy.cfg:154] : 'server denormalizer-api_prisma' : could not resolve address 'denormalizer-api_prisma'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190155 (35) : parsing [/cfg/haproxy.cfg:190] : 'server jupyter_jupyter' : could not resolve address 'jupyter_jupyter'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190155 (35) : parsing [/cfg/haproxy.cfg:245] : 'server signing-api-proxy_service' : could not resolve address 'signing-api-proxy_service'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | Exit Status: 1
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:56 Config validation failed. Will try again...
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:57 Validating configuration
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190157 (43) : parsing [/cfg/haproxy.cfg:60] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190157 (43) : parsing [/cfg/haproxy.cfg:68] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190157 (43) : parsing [/cfg/haproxy.cfg:73] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190157 (43) : parsing [/cfg/haproxy.cfg:81] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190157 (43) : parsing [/cfg/haproxy.cfg:89] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190157 (43) : parsing [/cfg/haproxy.cfg:106] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190157 (43) : parsing [/cfg/haproxy.cfg:154] : 'server denormalizer-api_prisma' : could not resolve address 'denormalizer-api_prisma'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190157 (43) : parsing [/cfg/haproxy.cfg:190] : 'server jupyter_jupyter' : could not resolve address 'jupyter_jupyter'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190157 (43) : parsing [/cfg/haproxy.cfg:245] : 'server signing-api-proxy_service' : could not resolve address 'signing-api-proxy_service'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | Exit Status: 1
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:57 Config validation failed. Will try again...
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190157 (43) : Failed to initialize server(s) addr.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190158 (44) : parsing [/cfg/haproxy.cfg:60] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190158 (44) : parsing [/cfg/haproxy.cfg:68] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190158 (44) : parsing [/cfg/haproxy.cfg:73] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190158 (44) : parsing [/cfg/haproxy.cfg:81] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190158 (44) : parsing [/cfg/haproxy.cfg:89] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190158 (44) : parsing [/cfg/haproxy.cfg:106] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:58 Validating configuration
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190158 (44) : parsing [/cfg/haproxy.cfg:154] : 'server denormalizer-api_prisma' : could not resolve address 'denormalizer-api_prisma'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190158 (44) : parsing [/cfg/haproxy.cfg:190] : 'server jupyter_jupyter' : could not resolve address 'jupyter_jupyter'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190158 (44) : parsing [/cfg/haproxy.cfg:245] : 'server signing-api-proxy_service' : could not resolve address 'signing-api-proxy_service'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | Exit Status: 1
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:01:59 Config validation failed. Will try again...
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190158 (44) : Failed to initialize server(s) addr.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190200 (45) : parsing [/cfg/haproxy.cfg:60] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190200 (45) : parsing [/cfg/haproxy.cfg:68] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190200 (45) : parsing [/cfg/haproxy.cfg:73] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190200 (45) : parsing [/cfg/haproxy.cfg:81] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190200 (45) : parsing [/cfg/haproxy.cfg:89] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [WARNING] 301/190200 (45) : parsing [/cfg/haproxy.cfg:106] : a 'http-request' rule placed after a 'use_backend' rule will still be processed before.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | 2018/10/29 19:02:00 Validating configuration
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190200 (45) : parsing [/cfg/haproxy.cfg:154] : 'server denormalizer-api_prisma' : could not resolve address 'denormalizer-api_prisma'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190200 (45) : parsing [/cfg/haproxy.cfg:190] : 'server jupyter_jupyter' : could not resolve address 'jupyter_jupyter'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190200 (45) : parsing [/cfg/haproxy.cfg:245] : 'server signing-api-proxy_service' : could not resolve address 'signing-api-proxy_service'.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | [ALERT] 301/190200 (45) : Failed to initialize server(s) addr.
proxy_proxy.1.oyksbrstd5dl@ip-172-31-25-61.ec2.internal | Exit Status: 1
Also of note, we had another issue over the weekend that I feel ties into our problem with DFP here. One of my colleagues launched a service with the df labels to reconfigure the proxy, but forgot to attach the service to the proxy network. This caused DFSL to reconfigure DFP with an entry for the new service name, but since DFP did not have network access to the container, it sent haproxy into the same "could not resolve address, Exit Status: 1" loop we see here. Since most of our services go through DFP, this unfortunately causes a widespread access outage (in staging this time, fortunately).
My point here is that even if DFSL accurately reports the state of the swarm to DFP 100% of the time, there are still cases where DFP could be unable to reach one of its backends through no fault of its own. As long as this causes an outage for every service going through the proxy, it makes me nervous having this component in the center of our architecture, as it were.
Ultimately, I think it would vastly improve the reliability of DFP if it protected itself from these kinds of cascading failures when a backend is unreachable.
It looks like it is time to add some sort of protection to DFP to handle the case of services not being reachable.
I think adding the following would help with this:
defaults
default-server init-addr last,libc,none
This configuration would allow servers whose hostnames cannot be resolved to start in the down state, while all other backends would still be up.
While looking through the source, it looks like this feature is already in DFP. It can be activated by setting CHECK_RESOLVERS to true.
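For anyone else hitting this: assuming the proxy service is named proxy_proxy as in the logs above, enabling the feature should just be a matter of adding the environment variable to the proxy service and letting it roll out, for example:
docker service update --env-add CHECK_RESOLVERS=true proxy_proxy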
Hi @thomasjpfan,
If setting CHECK_RESOLVERS to true resolves these unreachable-service 503 errors, then shouldn't it be set to true by default?
Going through the history of CHECK_RESOLVERS, it seems users want control over this feature: https://github.com/docker-flow/docker-flow-proxy/issues/1, https://github.com/docker-flow/docker-flow-proxy/issues/2. Changing the default now would break backwards compatibility.
Wow, I had no idea that was an option. Sounds like it's just what we were looking for. Will give it a try and let you know if it resolves the issue here.
Just tried out my test case with CHECK_RESOLVERS=true on DFP and it worked great! Instead of hitting the reconfiguration loop, the DFP logs show this:
[WARNING] 302/193024 (354) : parsing [/cfg/haproxy.cfg:220] : 'server jupyter_jupyter' : could not resolve address 'jupyter_jupyter', disabling server.
Everything else was accessible while this happened, and once the service came up the proxy reconfigured itself.
Thanks for the tip on the feature, this is fantastic!
Hey @maevyn11, doesn't that mean that if the jupyter_jupyter service didn't come up soon, it would also result in a 503 error? What I mean is... wouldn't there still be periods when you don't get 99.99% availability of the services?
We're automatically firing up and destroying containers on a set schedule. In order to avoid downtime due to the loop described in this ticket, we're running all services with CHECK_RESOLVERS=true. Unfortunately this doesn't work well with the letsencrypt certificate renewal checks used by hamburml/docker-flow-letsencrypt; it looks like the proxy is not updating quickly enough. Is there a way to exclude all /.well-known paths on any service from the CHECK_RESOLVERS check? I already tried adding the com.df.checkResolvers=false label to the letsencrypt service; unfortunately that doesn't work.
This project needs adoption. I moved to Kubernetes and cannot dedicate time to this project anymore. Similarly, involvement from other contributors dropped as well. Please consider contributing yourself if you think this project is useful.
Dear @rubensyltek
If this issue is still relevant, please feel free to leave a comment here.
Closed due to inactivity
Description
After a reboot of some of the worker nodes, DFP starts returning 503 for all services, even when the services are up and running again. DFP keeps returning 503 until it is restarted.
Steps to reproduce the issue:
Describe the results you received:
DFP keeps returning 503 even though all services are up and running
Describe the results you expected:
DFP starts routing traffic to services as soon as they get restarted
Additional information you deem important (e.g. issue happens only occasionally):
We need to restart (docker service update --force) both DFP and DFSL so they start working again. When the proxy is not responding, if we sh into the container and restart just the haproxy process, it starts working. The HAProxy configuration looks OK, although all requests return 503 (the basic black 503 message, not the fancy 503 page that the proxy returns when a service is not available).
Additional environment details (AWS, VirtualBox, physical, etc.):
We've managed to reproduce this issue both in AWS and on our own servers.