TykTechnologies / tyk

Tyk Open Source API Gateway written in Go, supporting REST, GraphQL, TCP and gRPC protocols

UptimeTest tyk proxies requests to upstream when Host is Down #2839

Closed maciejwojciechowski closed 4 years ago

maciejwojciechowski commented 4 years ago

Branch/Environment/Version

Describe the bug

With Uptime Tests and the "Use Uptime Tests with Target URL" option enabled, Tyk should not proxy requests to a host that is marked DOWN until the uptime tests report it as UP again. This is not what happens: Tyk still tries to proxy some requests while the host is DOWN.

Below is part of the log showing that the host is down while Tyk still proxies requests:

[Jan 27 10:19:04]  WARN host-check-mgr: [HOST CHECKER MANAGER] Host is DOWN: http://w2nothing.org/health
[Jan 27 10:19:31]  WARN HOST CHECKER: [HOST DOWN BUT NOT REACHED LIMIT]: http://w2nothing.org/health
[Jan 27 10:20:02]  WARN HOST CHECKER: [HOST DOWN BUT NOT REACHED LIMIT]: http://w2nothing.org/health
[Jan 27 10:20:29]  WARN HOST CHECKER: [HOST DOWN]: http://w2nothing.org/health
[Jan 27 10:20:29]  WARN host-check-mgr: [HOST CHECKER MANAGER] Host is DOWN: http://w2nothing.org/health
[Jan 27 10:20:59]  WARN HOST CHECKER: [HOST DOWN BUT NOT REACHED LIMIT]: http://w2nothing.org/health
[Jan 27 10:21:32]  WARN HOST CHECKER: [HOST DOWN BUT NOT REACHED LIMIT]: http://w2nothing.org/health
[Jan 27 10:22:01]  WARN HOST CHECKER: [HOST DOWN]: http://w2nothing.org/health
[Jan 27 10:22:01]  WARN host-check-mgr: [HOST CHECKER MANAGER] Host is DOWN: http://w2nothing.org/health
[Jan 27 10:22:31] ERROR PROXY: [LOAD BALANCING] all hosts are down, uptime tests are failing
[Jan 27 10:22:33]  WARN HOST CHECKER: [HOST DOWN BUT NOT REACHED LIMIT]: http://w2nothing.org/health
[Jan 27 10:22:34] ERROR PROXY: [LOAD BALANCING] all hosts are down, uptime tests are failing
[Jan 27 10:22:45] ERROR PROXY: [LOAD BALANCING] all hosts are down, uptime tests are failing
[Jan 27 10:22:54] ERROR PROXY: [LOAD BALANCING] all hosts are down, uptime tests are failing
[Jan 27 10:23:04]  WARN HOST CHECKER: [HOST DOWN BUT NOT REACHED LIMIT]: http://w2nothing.org/health
[Jan 27 10:23:21] ERROR proxy: http: proxy error: dial tcp: lookup w2nothing.org: no such host api_id=178584466495484f46b37d835c220d82 api_name=api mw=ReverseProxy org_id=5e2e9e28fe238f084b91ea38 server_name=w2nothing.org user_id=****ef9d user_ip=127.0.0.1 user_name=
[Jan 27 10:23:33]  WARN HOST CHECKER: [HOST DOWN]: http://w2nothing.org/health
[Jan 27 10:23:33]  WARN host-check-mgr: [HOST CHECKER MANAGER] Host is DOWN: http://w2nothing.org/health
[Jan 27 10:23:38] ERROR PROXY: [LOAD BALANCING] all hosts are down, uptime tests are failing
[Jan 27 10:24:00]  WARN HOST CHECKER: [HOST DOWN BUT NOT REACHED LIMIT]: http://w2nothing.org/health
[Jan 27 10:24:03] ERROR PROXY: [LOAD BALANCING] all hosts are down, uptime tests are failing
[Jan 27 10:24:30]  WARN HOST CHECKER: [HOST DOWN BUT NOT REACHED LIMIT]: http://w2nothing.org/health
[Jan 27 10:24:34] ERROR proxy: http: proxy error: dial tcp: lookup w2nothing.org: no such host api_id=178584466495484f46b37d835c220d82 api_name=api mw=ReverseProxy org_id=5e2e9e28fe238f084b91ea38 server_name=w2nothing.org user_id=****ef9d user_ip=127.0.0.1 user_name=
[Jan 27 10:24:59]  WARN HOST CHECKER: [HOST DOWN]: http://w2nothing.org/health
[Jan 27 10:24:59]  WARN host-check-mgr: [HOST CHECKER MANAGER] Host is DOWN: http://w2nothing.org/health
[Jan 27 10:25:27]  WARN HOST CHECKER: [HOST DOWN BUT NOT REACHED LIMIT]: http://w2nothing.org/health
[Jan 27 10:25:54]  WARN HOST CHECKER: [HOST DOWN]: http://w2nothing.org/health

Some requests are correctly rejected with [LOAD BALANCING] all hosts are down, but others are still proxied to the upstream, as shown by the proxy: http: proxy error: dial tcp: lookup entries.
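To make the split between rejected and proxied requests easier to see in a longer log, the lines can be tallied mechanically. The following is a minimal sketch, not part of Tyk: the names Classify, Tally and Outcome are hypothetical, and the matching relies only on the two message fragments quoted in this report.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// Outcome labels a gateway log line by what happened to a client request.
// (Hypothetical type for this sketch; not a Tyk API.)
type Outcome string

const (
	Rejected Outcome = "rejected" // load balancer refused: all hosts are down
	Proxied  Outcome = "proxied"  // reverse proxy attempted to reach the upstream
	Other    Outcome = "other"    // host checker output or unrelated lines
)

// Classify matches the two message fragments quoted in this report.
func Classify(line string) Outcome {
	switch {
	case strings.Contains(line, "[LOAD BALANCING] all hosts are down"):
		return Rejected
	case strings.Contains(line, "proxy error: dial tcp"):
		return Proxied
	default:
		return Other
	}
}

// Tally counts outcomes across a set of log lines.
func Tally(lines []string) map[Outcome]int {
	counts := make(map[Outcome]int)
	for _, l := range lines {
		counts[Classify(l)]++
	}
	return counts
}

func main() {
	// Reads the gateway log from stdin, e.g. `go run main.go < tyk.log`.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
	c := Tally(lines)
	fmt.Printf("rejected=%d proxied=%d other=%d\n", c[Rejected], c[Proxied], c[Other])
}
```

Run against the excerpt above, this counts six rejected requests and two proxy attempts, all of them logged after the host had already been reported DOWN.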

Reproduction steps

Steps to reproduce the behavior:

  1. Set this in tyk.conf:
    "uptime_tests": {
        "disable": false,
        "config": {
            "failure_trigger_sample_size": 2,
            "time_wait": 30,
            "checker_pool_size": 5,
            "enable_uptime_analytics": true
        }
    },
  2. Create an API with these features enabled:
    • Enable round-robin load balancing
    • Use Uptime Tests with Target URL
    • Use Uptime Tests with Target URL Check URLs

sample_api.txt

  3. Bring down the target host.

  4. Wait for Tyk to send the Host Down event.

  5. Send some traffic to the API.

    • The first time, you get the proper response: all hosts are down.
  6. Observe the logs. At some point the uptime tests log [HOST DOWN BUT NOT REACHED LIMIT].

  7. Send traffic to the API again.

Actual behavior

Tyk tries to proxy traffic to the upstream even though the host is still down.

Expected behavior

Tyk should not proxy traffic until the host is marked as UP by the uptime tests.
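The expected behavior can be illustrated with a small state tracker in which the DOWN state is latched and only a successful check clears it. This is a sketch of the expectation only, not Tyk's host checker: hostState, Record and ShouldProxy are hypothetical names, and failLimit is assumed to play the role of failure_trigger_sample_size from the configuration above.

```go
package main

import "fmt"

// hostState is a hypothetical tracker used to illustrate the expected
// behavior; it is not Tyk's implementation.
type hostState struct {
	failLimit int  // consecutive failed checks needed to mark the host down
	fails     int  // failed checks counted in the current sampling window
	down      bool // latched: only a successful check clears it
}

// Record applies the result of one uptime check.
func (h *hostState) Record(ok bool) {
	if ok {
		h.fails = 0
		h.down = false
		return
	}
	h.fails++
	if h.fails >= h.failLimit {
		h.down = true
		h.fails = 0 // start a new sampling window, but keep the host down
	}
}

// ShouldProxy reports whether requests may be forwarded to the upstream.
func (h *hostState) ShouldProxy() bool { return !h.down }

func main() {
	h := &hostState{failLimit: 2}
	// Four failed checks followed by one successful check.
	for i, ok := range []bool{false, false, false, false, true} {
		h.Record(ok)
		fmt.Printf("check %d ok=%v proxy=%v\n", i+1, ok, h.ShouldProxy())
	}
}
```

In this model the third check is a failure that has not yet reached the limit of the new window, which is the situation logged as [HOST DOWN BUT NOT REACHED LIMIT], and ShouldProxy still returns false. Proxying resumes only after the fifth, successful check.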

maciejwojciechowski commented 4 years ago

verified