rueian / cf-argo-benchmark


Flawed test methodology: Separating test client from the server on GKE #1

Open mw866 opened 4 years ago

mw866 commented 4 years ago

The test methodology is flawed. In this architecture, the test clients and servers reside in the same GKE cluster, which creates an unrealistic environment with very low latency and therefore artificially high throughput.

Explanation of the relation between latency and throughput: https://hpbn.co/primer-on-latency-and-bandwidth/#delivering-higher-bandwidth-and-lower-latencies
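
As a back-of-envelope check (my framing of the same point, using the Step 3a numbers below): a closed-loop benchmark like ab cannot exceed concurrency / mean latency, so a remote client will never see the numbers an in-cluster client sees.

package main

import "fmt"

func main() {
    concurrency := 5.0   // ab -c 5
    meanLatency := 0.072 // ~72 ms per request measured from Singapore (Step 3a below)
    // Throughput ceiling for a closed-loop benchmark: in-flight requests / latency.
    fmt.Printf("max throughput ≈ %.0f req/s\n", concurrency/meanLatency) // ≈ 69 req/s, matching the measured 69.21
}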

To compare the architectures fairly, the test client must sit outside the GKE cluster.

Recommendation: run the test from clients on (ideally) residential ISPs.

To prove my point, here are my test results from my replicated GKE environment in asia-east1. The client is my laptop in Singapore.

Step 1: (Optional) Scale up the backends and stop the in-cluster client deployment on GKE

k get pod
kubectl scale deploy backend --replicas=3
kubectl scale deploy client --replicas=0

Step 2: (Optional) Check Pods

k get pod --output wide 
NAME                       READY   STATUS    RESTARTS   AGE     IP           NODE                                        NOMINATED NODE   READINESS GATES
backend-56976bd47f-pb8k8   1/1     Running   0          22m     10.112.2.4   gke-cwang-argo-test-backend-d57ba9ad-ssjx   <none>           <none>
backend-56976bd47f-vcw4h   1/1     Running   0          3h46m   10.112.2.2   gke-cwang-argo-test-backend-d57ba9ad-ssjx   <none>           <none>
backend-56976bd47f-z8mxp   1/1     Running   0          22m     10.112.2.3   gke-cwang-argo-test-backend-d57ba9ad-ssjx   <none>           <none>
envoy-7599c9bc7c-vq6zn     1/1     Running   0          3h46m   10.112.1.2   gke-cwang-argo-test-egress-56d9ee91-z5c9    <none>           <none>
grafana-0                  1/1     Running   0          3h46m   10.112.4.3   gke-cwang-argo-test-metric-6d655417-02sf    <none>           <none>
prometheus-0               1/1     Running   0          3h46m   10.112.4.2   gke-cwang-argo-test-metric-6d655417-02sf    <none>           <none>
tunnel-0                   1/1     Running   0          88m     10.112.3.3   gke-cwang-argo-test-tunnel-225def6a-5h19    <none>           <none>

Step 3: Benchmarking

Step 3a: GCP LB/Ingress without Cloudflare

ab -r -n 10000 -c 5 http://34.102.144.64/healthz

This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 34.102.144.64 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        envoy
Server Hostname:        34.102.144.64
Server Port:            80

Document Path:          /healthz
Document Length:        0 bytes

Concurrency Level:      5
Time taken for tests:   144.486 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1410000 bytes
HTML transferred:       0 bytes
Requests per second:    69.21 [#/sec] (mean)
Time per request:       72.243 [ms] (mean)
Time per request:       14.449 [ms] (mean, across all concurrent requests)
Transfer rate:          9.53 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2    3   1.7      2     114
Processing:    50   69 183.2     51    3495
Waiting:       50   68 183.0     51    3495
Total:         52   71 183.2     54    3497

Percentage of the requests served within a certain time (ms)
  50%     54
  66%     54
  75%     54
  80%     55
  90%     55
  95%     56
  98%     61
  99%    617
 100%   3497 (longest request)

Step 3b: GCP LB/Ingress with Cloudflare

ab -r  -n 10000 -c 5 http://dcard-gke-ingress.chriswang.me/healthz

This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking dcard-gke-ingress.chriswang.me (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        cloudflare
Server Hostname:        dcard-gke-ingress.chriswang.me
Server Port:            80

Document Path:          /healthz
Document Length:        0 bytes

Concurrency Level:      5
Time taken for tests:   341.650 seconds
Complete requests:      10000
Failed requests:        5
   (Connect: 0, Receive: 5, Length: 0, Exceptions: 0)
Total transferred:      3968015 bytes
HTML transferred:       0 bytes
Requests per second:    29.27 [#/sec] (mean)
Time per request:       170.825 [ms] (mean)
Time per request:       34.165 [ms] (mean, across all concurrent requests)
Transfer rate:          11.34 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1   96 1159.0      2   36846
Processing:    56   74 118.9     64    3389
Waiting:        0   72  94.1     64    2584
Total:         58  169 1166.1     67   36911

Percentage of the requests served within a certain time (ms)
  50%     67
  66%     69
  75%     71
  80%     73
  90%     79
  95%     90
  98%   1143
  99%   2176
 100%  36911 (longest request)

Step 3c: Cloudflare with Argo Tunnel

ab -r -n 10000 -c 5 http://tunnel-0.chriswang.me/healthz

This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking tunnel-0.chriswang.me (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        cloudflare
Server Hostname:        tunnel-0.chriswang.me
Server Port:            80

Document Path:          /healthz
Document Length:        0 bytes

Concurrency Level:      5
Time taken for tests:   332.904 seconds
Complete requests:      10000
Failed requests:        3
   (Connect: 0, Receive: 3, Length: 0, Exceptions: 0)
Total transferred:      3996662 bytes
HTML transferred:       0 bytes
Requests per second:    30.04 [#/sec] (mean)
Time per request:       166.452 [ms] (mean)
Time per request:       33.290 [ms] (mean, across all concurrent requests)
Transfer rate:          11.72 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1   90 1043.2      2   36753
Processing:    54   75 115.1     64    3355
Waiting:        0   74 100.6     64    2219
Total:         56  165 1049.0     66   36813

Percentage of the requests served within a certain time (ms)
  50%     66
  66%     69
  75%     71
  80%     73
  90%     82
  95%    159
  98%   1136
  99%   2169
 100%  36813 (longest request)
rueian commented 4 years ago

Hi @mw866, thank you for your feedback and for sharing; it is all very helpful.

Yes, I know that having the clients and servers in the same GKE cluster, or both inside the GCP network, is unrealistic. The reason I did so was to first make sure there is no unacceptable performance issue inside my GKE cluster.

And I can now achieve a promising 3K req/s throughput by applying a small fix to cloudflared, which is mentioned in PR https://github.com/cloudflare/cloudflared/pull/155.

However, compared to the clients -> Cloudflare -> GCP Load Balancer/GKE Ingress setup, the Argo Tunnel setup still has much higher latency (60-some ms vs 40-some ms) and lower throughput (3K req/s vs 5K req/s).

The Argo Tunnel setup: (screenshot)

The clients -> Cloudflare -> GCP Load Balancer/GKE Ingress setup: (screenshot)

In your opinion, how can I investigate further to improve the result?

Thank you.

mw866 commented 4 years ago

Good to know the PR has helped to narrow the performance gap. To improve the tunnel performance further, let's collect more data for analysis.

Data source 1: The cloudflared logs

Below is an example from my environment in GCP's asia-east1: 2x tunnels to Cloudflare Hong Kong (HKG) + 2x tunnels to Cloudflare Taipei (TPE). This is in fact expected; by design, the tunnels are spread across the 2 nearest PoPs for high availability.

Note the metrics server is started on port 8888.

k logs tunnel-0
time="2019-12-09T05:46:51Z" level=warning msg="Cannot determine default configuration path. No file [config.yml config.yaml] in [~/.cloudflared ~/.cloudflare-warp ~/cloudflare-warp /usr/local/etc/cloudflared /etc/cloudflared]"
time="2019-12-09T05:46:51Z" level=info msg="Version 2019.11.3"
time="2019-12-09T05:46:51Z" level=info msg="GOOS: linux, GOVersion: go1.12.7, GoArch: amd64"
time="2019-12-09T05:46:51Z" level=info msg=Flags hostname=tunnel-0.chriswang.me metrics="0.0.0.0:8888" no-autoupdate=true no-tls-verify=true origincert=/etc/cloudflared/cert.pem proxy-dns-upstream="https://1.1.1.1/dns-query, https://1.0.0.1/dns-query" proxy-keepalive-connections=1000 url="https://envoy"
time="2019-12-09T05:46:51Z" level=info msg="Starting metrics server" addr="[::]:8888"
time="2019-12-09T05:46:51Z" level=info msg="Proxying tunnel requests to https://envoy"
time="2019-12-09T05:46:54Z" level=info msg="Connected to HKG" connectionID=0
time="2019-12-09T05:46:54Z" level=info msg="Each HA connection's tunnel IDs: map[0:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug]" connectionID=0
time="2019-12-09T05:46:54Z" level=info msg="Route propagating, it may take up to 1 minute for your new route to become functional" connectionID=0
time="2019-12-09T05:46:56Z" level=info msg="Connected to TPE" connectionID=1
time="2019-12-09T05:46:56Z" level=info msg="Each HA connection's tunnel IDs: map[0:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 1:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug]" connectionID=1
time="2019-12-09T05:46:56Z" level=info msg="Route propagating, it may take up to 1 minute for your new route to become functional" connectionID=1
time="2019-12-09T05:46:57Z" level=info msg="Connected to HKG" connectionID=2
time="2019-12-09T05:46:57Z" level=info msg="Each HA connection's tunnel IDs: map[0:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 1:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 2:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug]" connectionID=2
time="2019-12-09T05:46:57Z" level=info msg="Route propagating, it may take up to 1 minute for your new route to become functional" connectionID=2
time="2019-12-09T05:47:00Z" level=info msg="Connected to TPE" connectionID=3
time="2019-12-09T05:47:00Z" level=info msg="Each HA connection's tunnel IDs: map[0:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 1:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 2:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 3:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug]" connectionID=3
time="2019-12-09T05:47:00Z" level=info msg="Route propagating, it may take up to 1 minute for your new route to become functional" connectionID=3
time="2019-12-09T18:34:21Z" level=warning msg="frame read error" dir=read error="read tcp 10.112.3.3:58818->198.41.192.227:7844: read: connection reset by peer" subsystem=mux
time="2019-12-09T18:34:21Z" level=error msg="Serve tunnel error" connectionID=0 error="read tcp 10.112.3.3:58818->198.41.192.227:7844: read: connection reset by peer"
time="2019-12-09T18:34:21Z" level=info msg="Retrying in 1s seconds" connectionID=0
time="2019-12-09T18:34:22Z" level=warning msg="frame read error" dir=read error="read tcp 10.112.3.3:35256->198.41.192.7:7844: read: connection reset by peer" subsystem=mux
time="2019-12-09T18:34:22Z" level=error msg="Serve tunnel error" connectionID=2 error="read tcp 10.112.3.3:35256->198.41.192.7:7844: read: connection reset by peer"
time="2019-12-09T18:34:22Z" level=info msg="Retrying in 1s seconds" connectionID=2
time="2019-12-09T18:34:40Z" level=info msg="Connected to NRT" connectionID=0
time="2019-12-09T18:34:40Z" level=info msg="Each HA connection's tunnel IDs: map[0:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 1:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 2:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 3:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug]" connectionID=0
time="2019-12-09T18:34:40Z" level=info msg="Route propagating, it may take up to 1 minute for your new route to become functional" connectionID=0
time="2019-12-09T18:37:51Z" level=info msg="Connected to NRT" connectionID=2
time="2019-12-09T18:37:51Z" level=info msg="Each HA connection's tunnel IDs: map[0:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 1:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 2:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 3:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug]" connectionID=2
time="2019-12-09T18:37:51Z" level=info msg="Route propagating, it may take up to 1 minute for your new route to become functional" connectionID=2
time="2019-12-09T23:14:31Z" level=warning msg="frame read error" dir=read error="read tcp 10.112.3.3:56124->198.41.192.7:7844: read: connection reset by peer" subsystem=mux
time="2019-12-09T23:14:31Z" level=error msg="Serve tunnel error" connectionID=2 error="read tcp 10.112.3.3:56124->198.41.192.7:7844: read: connection reset by peer"
time="2019-12-09T23:14:31Z" level=info msg="Retrying in 1s seconds" connectionID=2
time="2019-12-09T23:14:32Z" level=warning msg="frame read error" dir=read error="read tcp 10.112.3.3:51454->198.41.192.227:7844: read: connection reset by peer" subsystem=mux
time="2019-12-09T23:14:32Z" level=error msg="Serve tunnel error" connectionID=0 error="read tcp 10.112.3.3:51454->198.41.192.227:7844: read: connection reset by peer"
time="2019-12-09T23:14:32Z" level=info msg="Retrying in 1s seconds" connectionID=0
time="2019-12-09T23:16:08Z" level=info msg="Connected to HKG" connectionID=2
time="2019-12-09T23:16:08Z" level=info msg="Each HA connection's tunnel IDs: map[0:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 1:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 2:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 3:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug]" connectionID=2
time="2019-12-09T23:16:08Z" level=info msg="Route propagating, it may take up to 1 minute for your new route to become functional" connectionID=2
time="2019-12-09T23:16:08Z" level=info msg="Connected to HKG" connectionID=0
time="2019-12-09T23:16:08Z" level=info msg="Each HA connection's tunnel IDs: map[0:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 1:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 2:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug 3:efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug]" connectionID=0
time="2019-12-09T23:16:08Z" level=info msg="Route propagating, it may take up to 1 minute for your new route to become functional" connectionID=0

Data source 2: cloudflared's local metrics server

Metrics from the cloudflared daemon are exposed via its local metrics server (started on port 8888, as noted above).
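
While benchmarking, it can help to watch just the tunnel-related series. The little helper below is illustrative only (plain curl, as shown next, works just as well) and assumes the port-forward from the following snippet is running.

package main

import (
    "bufio"
    "fmt"
    "log"
    "net/http"
    "strings"
)

func main() {
    // Assumes `kubectl port-forward tunnel-0 8888` is active, as shown below.
    resp, err := http.Get("http://localhost:8888/metrics")
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    // Keep only the tunnel- and RTT-related series, which are the ones of interest here.
    sc := bufio.NewScanner(resp.Body)
    for sc.Scan() {
        line := sc.Text()
        if strings.HasPrefix(line, "cloudflared_tunnel_") || strings.HasPrefix(line, "cloudflared_muxer_rtt") {
            fmt.Println(line)
        }
    }
}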

Example metrics from my environment.

$ k port-forward tunnel-0 8888
Forwarding from 127.0.0.1:8888 -> 8888
Forwarding from [::1]:8888 -> 8888
Handling connection for 8888

$ curl http://localhost:8888/metrics
# HELP build_info Build and version information
# TYPE build_info gauge
build_info{goversion="go1.12.7",revision="2019-11-20-2249 UTC",version="2019.11.3"} 1
# HELP cloudflared_muxer_comp_bytes_after Bytes sent via cross-stream compression, post compression
# TYPE cloudflared_muxer_comp_bytes_after gauge
cloudflared_muxer_comp_bytes_after{connection_id="0"} 0
cloudflared_muxer_comp_bytes_after{connection_id="1"} 0
cloudflared_muxer_comp_bytes_after{connection_id="2"} 0
cloudflared_muxer_comp_bytes_after{connection_id="3"} 0
# HELP cloudflared_muxer_comp_bytes_before Bytes sent via cross-stream compression, pre compression
# TYPE cloudflared_muxer_comp_bytes_before gauge
cloudflared_muxer_comp_bytes_before{connection_id="0"} 0
cloudflared_muxer_comp_bytes_before{connection_id="1"} 0
cloudflared_muxer_comp_bytes_before{connection_id="2"} 0
cloudflared_muxer_comp_bytes_before{connection_id="3"} 0
# HELP cloudflared_muxer_comp_rate_ave Average outbound cross-stream compression ratio
# TYPE cloudflared_muxer_comp_rate_ave gauge
cloudflared_muxer_comp_rate_ave{connection_id="0"} 1
cloudflared_muxer_comp_rate_ave{connection_id="1"} 1
cloudflared_muxer_comp_rate_ave{connection_id="2"} 1
cloudflared_muxer_comp_rate_ave{connection_id="3"} 1
# HELP cloudflared_muxer_inbound_bytes_per_sec_curr Current inbounding bytes per second, 0 if there is no incoming connection
# TYPE cloudflared_muxer_inbound_bytes_per_sec_curr gauge
cloudflared_muxer_inbound_bytes_per_sec_curr{connection_id="0"} 0
cloudflared_muxer_inbound_bytes_per_sec_curr{connection_id="1"} 0
cloudflared_muxer_inbound_bytes_per_sec_curr{connection_id="2"} 0
cloudflared_muxer_inbound_bytes_per_sec_curr{connection_id="3"} 0
# HELP cloudflared_muxer_inbound_bytes_per_sec_max Maximum inbounding bytes per second
# TYPE cloudflared_muxer_inbound_bytes_per_sec_max gauge
cloudflared_muxer_inbound_bytes_per_sec_max{connection_id="0"} 256
cloudflared_muxer_inbound_bytes_per_sec_max{connection_id="1"} 256
cloudflared_muxer_inbound_bytes_per_sec_max{connection_id="2"} 256
cloudflared_muxer_inbound_bytes_per_sec_max{connection_id="3"} 256
# HELP cloudflared_muxer_inbound_bytes_per_sec_min Minimum non-zero inbounding bytes per second
# TYPE cloudflared_muxer_inbound_bytes_per_sec_min gauge
cloudflared_muxer_inbound_bytes_per_sec_min{connection_id="0"} 200
cloudflared_muxer_inbound_bytes_per_sec_min{connection_id="1"} 200
cloudflared_muxer_inbound_bytes_per_sec_min{connection_id="2"} 200
cloudflared_muxer_inbound_bytes_per_sec_min{connection_id="3"} 200
# HELP cloudflared_muxer_outbound_bytes_per_sec_curr Current outbounding bytes per second, 0 if there is no outgoing traffic
# TYPE cloudflared_muxer_outbound_bytes_per_sec_curr gauge
cloudflared_muxer_outbound_bytes_per_sec_curr{connection_id="0"} 0
cloudflared_muxer_outbound_bytes_per_sec_curr{connection_id="1"} 0
cloudflared_muxer_outbound_bytes_per_sec_curr{connection_id="2"} 0
cloudflared_muxer_outbound_bytes_per_sec_curr{connection_id="3"} 0
# HELP cloudflared_muxer_outbound_bytes_per_sec_max Maximum outbounding bytes per second
# TYPE cloudflared_muxer_outbound_bytes_per_sec_max gauge
cloudflared_muxer_outbound_bytes_per_sec_max{connection_id="0"} 2504
cloudflared_muxer_outbound_bytes_per_sec_max{connection_id="1"} 2504
cloudflared_muxer_outbound_bytes_per_sec_max{connection_id="2"} 2504
cloudflared_muxer_outbound_bytes_per_sec_max{connection_id="3"} 2504
# HELP cloudflared_muxer_outbound_bytes_per_sec_min Minimum non-zero outbounding bytes per second
# TYPE cloudflared_muxer_outbound_bytes_per_sec_min gauge
cloudflared_muxer_outbound_bytes_per_sec_min{connection_id="0"} 40
cloudflared_muxer_outbound_bytes_per_sec_min{connection_id="1"} 104
cloudflared_muxer_outbound_bytes_per_sec_min{connection_id="2"} 40
cloudflared_muxer_outbound_bytes_per_sec_min{connection_id="3"} 104
# HELP cloudflared_muxer_receive_window_ave Average receive window size in bytes
# TYPE cloudflared_muxer_receive_window_ave gauge
cloudflared_muxer_receive_window_ave{connection_id="0"} 65516.77777777778
cloudflared_muxer_receive_window_ave{connection_id="1"} 65535
cloudflared_muxer_receive_window_ave{connection_id="2"} 65518.71428571428
cloudflared_muxer_receive_window_ave{connection_id="3"} 65535
# HELP cloudflared_muxer_receive_window_max Largest receive window size in bytes
# TYPE cloudflared_muxer_receive_window_max gauge
cloudflared_muxer_receive_window_max{connection_id="0"} 65535
cloudflared_muxer_receive_window_max{connection_id="1"} 65535
cloudflared_muxer_receive_window_max{connection_id="2"} 65535
cloudflared_muxer_receive_window_max{connection_id="3"} 65535
# HELP cloudflared_muxer_receive_window_min Smallest receive window size in bytes
# TYPE cloudflared_muxer_receive_window_min gauge
cloudflared_muxer_receive_window_min{connection_id="0"} 65279
cloudflared_muxer_receive_window_min{connection_id="1"} 65279
cloudflared_muxer_receive_window_min{connection_id="2"} 65279
cloudflared_muxer_receive_window_min{connection_id="3"} 65279
# HELP cloudflared_muxer_rtt Round-trip time in millisecond
# TYPE cloudflared_muxer_rtt gauge
cloudflared_muxer_rtt{connection_id="0"} 13
cloudflared_muxer_rtt{connection_id="1"} 3
cloudflared_muxer_rtt{connection_id="2"} 12
cloudflared_muxer_rtt{connection_id="3"} 3
# HELP cloudflared_muxer_rtt_max Longest round-trip time in millisecond
# TYPE cloudflared_muxer_rtt_max gauge
cloudflared_muxer_rtt_max{connection_id="0"} 13
cloudflared_muxer_rtt_max{connection_id="1"} 6
cloudflared_muxer_rtt_max{connection_id="2"} 13
cloudflared_muxer_rtt_max{connection_id="3"} 7
# HELP cloudflared_muxer_rtt_min Shortest round-trip time in millisecond
# TYPE cloudflared_muxer_rtt_min gauge
cloudflared_muxer_rtt_min{connection_id="0"} 12
cloudflared_muxer_rtt_min{connection_id="1"} 3
cloudflared_muxer_rtt_min{connection_id="2"} 12
cloudflared_muxer_rtt_min{connection_id="3"} 3
# HELP cloudflared_muxer_send_window_ave Average send window size in bytes
# TYPE cloudflared_muxer_send_window_ave gauge
cloudflared_muxer_send_window_ave{connection_id="0"} 64928.71428571428
cloudflared_muxer_send_window_ave{connection_id="1"} 65535
cloudflared_muxer_send_window_ave{connection_id="2"} 64845.47619047619
cloudflared_muxer_send_window_ave{connection_id="3"} 65533.96
# HELP cloudflared_muxer_send_window_max Largest send window size in bytes
# TYPE cloudflared_muxer_send_window_max gauge
cloudflared_muxer_send_window_max{connection_id="0"} 65535
cloudflared_muxer_send_window_max{connection_id="1"} 65535
cloudflared_muxer_send_window_max{connection_id="2"} 65535
cloudflared_muxer_send_window_max{connection_id="3"} 65535
# HELP cloudflared_muxer_send_window_min Smallest send window size in bytes
# TYPE cloudflared_muxer_send_window_min gauge
cloudflared_muxer_send_window_min{connection_id="0"} 63111
cloudflared_muxer_send_window_min{connection_id="1"} 63111
cloudflared_muxer_send_window_min{connection_id="2"} 63111
cloudflared_muxer_send_window_min{connection_id="3"} 63111
# HELP cloudflared_tunnel_active_streams Number of active streams created by all muxers.
# TYPE cloudflared_tunnel_active_streams gauge
cloudflared_tunnel_active_streams 0
# HELP cloudflared_tunnel_concurrent_requests_per_tunnel Concurrent requests proxied through each tunnel
# TYPE cloudflared_tunnel_concurrent_requests_per_tunnel gauge
cloudflared_tunnel_concurrent_requests_per_tunnel{connection_id="1"} 0
cloudflared_tunnel_concurrent_requests_per_tunnel{connection_id="2"} 0
cloudflared_tunnel_concurrent_requests_per_tunnel{connection_id="3"} 0
# HELP cloudflared_tunnel_ha_connections Number of active ha connections
# TYPE cloudflared_tunnel_ha_connections gauge
cloudflared_tunnel_ha_connections 4
# HELP cloudflared_tunnel_max_concurrent_requests_per_tunnel Largest number of concurrent requests proxied through each tunnel so far
# TYPE cloudflared_tunnel_max_concurrent_requests_per_tunnel gauge
cloudflared_tunnel_max_concurrent_requests_per_tunnel{connection_id="1"} 1
cloudflared_tunnel_max_concurrent_requests_per_tunnel{connection_id="2"} 3
cloudflared_tunnel_max_concurrent_requests_per_tunnel{connection_id="3"} 1
# HELP cloudflared_tunnel_requests_per_tunnel Amount of requests proxied through each tunnel
# TYPE cloudflared_tunnel_requests_per_tunnel counter
cloudflared_tunnel_requests_per_tunnel{connection_id="1"} 44
cloudflared_tunnel_requests_per_tunnel{connection_id="2"} 11611
cloudflared_tunnel_requests_per_tunnel{connection_id="3"} 44
# HELP cloudflared_tunnel_response_by_code Count of responses by HTTP status code
# TYPE cloudflared_tunnel_response_by_code counter
cloudflared_tunnel_response_by_code{status_code="200"} 11699
# HELP cloudflared_tunnel_response_code_per_tunnel Count of responses by HTTP status code fore each tunnel
# TYPE cloudflared_tunnel_response_code_per_tunnel counter
cloudflared_tunnel_response_code_per_tunnel{connection_id="1",status_code="200"} 44
cloudflared_tunnel_response_code_per_tunnel{connection_id="2",status_code="200"} 11611
cloudflared_tunnel_response_code_per_tunnel{connection_id="3",status_code="200"} 44
# HELP cloudflared_tunnel_server_locations Where each tunnel is connected to. 1 means current location, 0 means previous locations.
# TYPE cloudflared_tunnel_server_locations gauge
cloudflared_tunnel_server_locations{connection_id="0",location="HKG"} 1
cloudflared_tunnel_server_locations{connection_id="0",location="NRT"} 0
cloudflared_tunnel_server_locations{connection_id="1",location="TPE"} 1
cloudflared_tunnel_server_locations{connection_id="2",location="HKG"} 1
cloudflared_tunnel_server_locations{connection_id="2",location="NRT"} 0
cloudflared_tunnel_server_locations{connection_id="3",location="TPE"} 1
# HELP cloudflared_tunnel_timer_retries Unacknowledged heart beats count
# TYPE cloudflared_tunnel_timer_retries gauge
cloudflared_tunnel_timer_retries 0
# HELP cloudflared_tunnel_total_requests Amount of requests proxied through all the tunnels
# TYPE cloudflared_tunnel_total_requests counter
cloudflared_tunnel_total_requests 11699
# HELP cloudflared_tunnel_tunnel_register_fail Count of tunnel registration errors by type
# TYPE cloudflared_tunnel_tunnel_register_fail counter
cloudflared_tunnel_tunnel_register_fail{error="read tcp 10.112.3.3:35256->198.41.192.7:7844: read: connection reset by peer"} 1
cloudflared_tunnel_tunnel_register_fail{error="read tcp 10.112.3.3:51454->198.41.192.227:7844: read: connection reset by peer"} 1
cloudflared_tunnel_tunnel_register_fail{error="read tcp 10.112.3.3:56124->198.41.192.7:7844: read: connection reset by peer"} 1
cloudflared_tunnel_tunnel_register_fail{error="read tcp 10.112.3.3:58818->198.41.192.227:7844: read: connection reset by peer"} 1
# HELP cloudflared_tunnel_tunnel_register_success Count of successful tunnel registrations
# TYPE cloudflared_tunnel_tunnel_register_success counter
cloudflared_tunnel_tunnel_register_success 8
# HELP cloudflared_tunnel_user_hostnames_counts Which user hostnames cloudflared is serving
# TYPE cloudflared_tunnel_user_hostnames_counts counter
cloudflared_tunnel_user_hostnames_counts{userHostname="https://tunnel-0.chriswang.me"} 8
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.0032e-05
go_gc_duration_seconds{quantile="0.25"} 1.5198e-05
go_gc_duration_seconds{quantile="0.5"} 3.5357e-05
go_gc_duration_seconds{quantile="0.75"} 5.3302e-05
go_gc_duration_seconds{quantile="1"} 0.004540342
go_gc_duration_seconds_sum 0.208643521
go_gc_duration_seconds_count 2890
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 51
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.12.7"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 6.92932e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 1.364542388e+10
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.511543e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 1.2534642e+07
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction 3.0319631906863982e-05
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 2.390016e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 6.92932e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 5.6705024e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 9.158656e+06
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 31079
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 5.1478528e+07
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 6.586368e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.5759441446910505e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 1.2565721e+07
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 6944
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 95184
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 131072
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 1.1478304e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 1.128577e+06
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 1.245184e+06
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 1.245184e+06
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 7.2286456e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 16
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 126.65
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 13
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.1680896e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.57587041023e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.199517696e+09
# HELP tunnel_ids The ID of all tunnels (and their corresponding HA connection ID) running in this instance of cloudflared.
# TYPE tunnel_ids gauge
tunnel_ids{ha_conn_id="0",tunnel_id="efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug"} 1
tunnel_ids{ha_conn_id="1",tunnel_id="efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug"} 1
tunnel_ids{ha_conn_id="2",tunnel_id="efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug"} 1
tunnel_ids{ha_conn_id="3",tunnel_id="efg74ar4601k9lz3fz1ar5q9cyerdr4i4sf2942gs0ffq38an1ug"} 1

About cross-stream compression

One thing worth highlighting in the metrics above is the reference to cross-stream compression, which is a beta feature of the Cloudflare Tunnel. Below is how to configure it.

cloudflared tunnel -help | grep compression
   --compression-quality value                         (beta) Use cross-stream compression instead HTTP compression. 0-off, 1-low, 2-medium, >=3-high. (default: 0) [$TUNNEL_COMPRESSION_LEVEL]
mw866 commented 4 years ago

I've run some synthetic performance tests with Catchpoint on my test environment. Results: https://p.catchpoint.com/ui/Entry/PC/V/AR8-C-D-K3t1ejXfQPmZYAA

In summary, the performance difference between Tunnel and GCP LB does exist, but it is marginal, around 5% in terms of HTTPS response time.

rueian commented 4 years ago

Screenshot 2019-12-12 5:17:36 PM (image)

After comparing different machine types, it seems that if cloudflared can allocate memory faster, the latency gets lower and the throughput gets higher. See the bottom-left, bottom-right, and top-right graphs.

However, the total memory usage of cloudflared stays around 190-some MB. I don't know why yet, but maybe I can record some profiling graphs in a couple of days.
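
For reference, the usual way to record such a profile from a Go program is the standard net/http/pprof handler. The sketch below is generic; I have not checked whether cloudflared's metrics server already exposes /debug/pprof.

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
    // With this running, `go tool pprof http://localhost:6060/debug/pprof/heap`
    // produces the kind of allocation graph discussed later in this thread.
    log.Fatal(http.ListenAndServe("localhost:6060", nil))
}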

mw866 commented 4 years ago

@rueian Thanks for helping to probe further. Good to see we managed to improve the rps and latency further by varying machine types.

To explain the correlation you mention, we would need to understand the internals of cloudflared specifically, which is unfortunately beyond my knowledge. Once you have the profiling graph, can you log an issue at the main cloudflare/cloudflared repo so that the developers and the community can review it too?

Questions

  1. Are the Grafana charts in your screenshots stacked? E.g. in the top-right graph, is the max 3K rps (a) the sum of all 4 connections, or (b) connection_id=1 only?

  2. What are the machine types in the top-right graph? Can it be boosted further with other machine types (e.g. CPU- or memory-optimized)?

rueian commented 4 years ago

Hi @mw866,

First, to reply to your questions:

  1. In the previous screenshot, the top-right graph is NOT stacked.
  2. In the previous screenshot, the metric waves were generated from 3 types of GCP machines: 1st wave: n1-highcpu-4, 2nd wave: n1-highcpu-8, 3rd wave: n1-highcpu-16.

So, it seems that cloudflared is quite busy allocating and releasing memory.

Luckily, after collecting the profiling graph, it does show where the cause is and that it can be optimized.

Screenshot 2019-12-18 6:20:43 PM (image)

It shows that almost all allocations come from TunnelHandler.serveHTTP. I believe it is this line of code doing the allocation: https://github.com/cloudflare/cloudflared/blob/789ca6f6f46a72b5138a1700e476e8c1d04528bf/origin/tunnel.go#L645, and it can be optimized with a buffer pool mechanism.
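
For readers unfamiliar with the idea, a minimal sketch of sync.Pool-based buffer reuse is below. This is my illustration only, not the actual cloudflared change; the real POC commit is linked further down.

package main

import (
    "fmt"
    "io"
    "strings"
    "sync"
)

// bufferPool reuses copy buffers across requests instead of allocating a new
// slice for every proxied response body. The 128 KiB size is only illustrative.
var bufferPool = sync.Pool{
    New: func() interface{} { return make([]byte, 128*1024) },
}

// copyWithPool shows what a pooled version of the hot copy path could look like.
func copyWithPool(dst io.Writer, src io.Reader) (int64, error) {
    buf := bufferPool.Get().([]byte)
    defer bufferPool.Put(buf)
    return io.CopyBuffer(dst, src, buf)
}

func main() {
    n, err := copyWithPool(io.Discard, strings.NewReader("origin response body"))
    fmt.Println(n, err)
}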

The result of the POC on an n1-highcpu-4 machine:

Screenshot 2019-12-18 6:51:20 PM (image)

After applying the buffer pool mechanism:

The commit of the POC: https://github.com/rueian/cloudflared/commit/1d3173d6bc2995d9dabce5f03a1e8b4a364c0236

rueian commented 4 years ago

I have opened an issue and a PR to cloudflared to report this discovery. https://github.com/cloudflare/cloudflared/issues/160 https://github.com/cloudflare/cloudflared/pull/161