gliderlabs / docker-alpine

Alpine Linux Docker image. Win at minimalism!
http://gliderlabs.viewdocs.io/docker-alpine
BSD 2-Clause "Simplified" License

CDN performance inexplicably terrible #164

Open krallin opened 8 years ago

krallin commented 8 years ago

When installing packages via the Fastly-provided CDN, performance from Europe is terrible compared with the nl.alpinelinux.org mirror, whereas the reverse is true from an AWS instance in us-east-1.

Here are some timings:

From my workstation, in Europe:

$ docker run -it gliderlabs/alpine:3.3 time apk add --update --repository http://nl.alpinelinux.org/alpine/v3.3/community/ --repositories-file /dev/null go
fetch http://nl.alpinelinux.org/alpine/v3.3/community/x86_64/APKINDEX.tar.gz
(1/1) Installing go (1.5.3-r0)
Executing busybox-1.24.1-r7.trigger
OK: 177 MiB in 12 packages
real    0m 29.71s
user    0m 0.79s
sys     0m 6.31s

$ docker run -it gliderlabs/alpine:3.3 time apk add --update go
fetch http://alpine.gliderlabs.com/alpine/v3.3/main/x86_64/APKINDEX.tar.gz
fetch http://alpine.gliderlabs.com/alpine/v3.3/community/x86_64/APKINDEX.tar.gz
(1/1) Installing go (1.5.3-r0)
Executing busybox-1.24.1-r7.trigger
OK: 177 MiB in 12 packages
real    14m 14.96s
user    0m 0.75s
sys     0m 8.38s

From an AWS instance, in us-east-1:

$ docker run -it gliderlabs/alpine:3.3 time apk add --update --repository http://nl.alpinelinux.org/alpine/v3.3/community/ --repositories-file /dev/null go
fetch http://nl.alpinelinux.org/alpine/v3.3/community/x86_64/APKINDEX.tar.gz
(1/1) Installing go (1.5.3-r0)
Executing busybox-1.24.1-r7.trigger
OK: 177 MiB in 12 packages
real    3m 0.32s
user    0m 2.27s
sys     0m 0.95s

$ docker run -it gliderlabs/alpine:3.3 time apk add --update go
fetch http://alpine.gliderlabs.com/alpine/v3.3/main/x86_64/APKINDEX.tar.gz
fetch http://alpine.gliderlabs.com/alpine/v3.3/community/x86_64/APKINDEX.tar.gz
(1/1) Installing go (1.5.3-r0)
Executing busybox-1.24.1-r7.trigger
OK: 177 MiB in 12 packages
real    0m 35.05s
user    0m 2.11s
sys     0m 0.78s
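To put the four timings above on a common scale, here is a small sketch that converts the busybox `time` values to seconds and computes the slowdown ratios. The helper names `to_seconds` and `slowdown` are made up for this example; the input values are the `real` figures quoted above.

```sh
#!/bin/sh
# Convert a busybox `time` value such as "14m 14.96s" to seconds.
# (Hypothetical helper, not part of apk or busybox.)
to_seconds() {
    printf '%s\n' "$1" |
        awk '{ sub(/m$/, "", $1); sub(/s$/, "", $2); printf "%.2f\n", $1 * 60 + $2 }'
}

# Print how many times larger $1 is than $2 (both in seconds).
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

eu_cdn=$(to_seconds "14m 14.96s")  # Europe, alpine.gliderlabs.com
eu_nl=$(to_seconds "0m 29.71s")    # Europe, nl.alpinelinux.org
us_nl=$(to_seconds "3m 0.32s")     # us-east-1, nl.alpinelinux.org
us_cdn=$(to_seconds "0m 35.05s")   # us-east-1, alpine.gliderlabs.com

echo "Europe: CDN took $(slowdown "$eu_cdn" "$eu_nl")x as long as the nl mirror"
echo "us-east-1: nl mirror took $(slowdown "$us_nl" "$us_cdn")x as long as the CDN"
```

With these numbers the CDN run from Europe takes roughly 29 times as long as the nl mirror, while from us-east-1 the nl mirror takes roughly 5 times as long as the CDN.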

When connecting from an AWS instance in Europe, performance is pretty good, which leads me to believe this has more to do with my residential internet connection than with geographical location. Since you mentioned a partnership with Fastly, I figured this might be the right place to ask: is there anything I can do to help understand or troubleshoot this? (I understand it's entirely possible that neither you nor Fastly can fix it.)

Thanks!

andyshinn commented 8 years ago

That is definitely pretty slow. Can you perform some traceroutes to dl-cdn.alpinelinux.org and nl.alpinelinux.org and see if the POPs being resolved for Fastly are somewhere unexpected? It is possible that the Fastly POP being chosen for your locale is just wrong.

krallin commented 8 years ago

@andyshinn Sure; here you go:

 1  router (192.168.1.1)  1.878 ms  1.286 ms  1.830 ms
 2  80.10.236.89 (80.10.236.89)  20.998 ms  6.898 ms  7.283 ms
 3  ae113-0.ncidf104.paris.francetelecom.net (193.249.212.38)  6.109 ms  6.668 ms  6.889 ms
 4  ae44-0.niaub102.aubervilliers.francetelecom.net (193.252.159.46)  9.978 ms  6.712 ms  6.606 ms
 5  193.252.137.70 (193.252.137.70)  22.008 ms  24.679 ms  23.751 ms
 6  tengige0-6-0-34.lontr4.london.opentransit.net (193.251.242.81)  17.644 ms  24.062 ms  22.367 ms
 7  level3-1.gw.opentransit.net (193.251.255.80)  16.755 ms  24.164 ms  17.299 ms
 8  ae-2-70.edge5.frankfurt1.level3.net (4.69.154.73)  24.240 ms  22.962 ms  22.228 ms
 9  ae-2-70.edge5.frankfurt1.level3.net (4.69.154.73)  22.416 ms  26.187 ms  24.408 ms
10  * * *
11  185.31.17.249 (185.31.17.249)  40.755 ms  38.819 ms  41.751 ms

As for why it's hitting that address, here is the DNS resolution:

; <<>> DiG 9.8.3-P1 <<>> dl-cdn.alpinelinux.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8447
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;dl-cdn.alpinelinux.org.        IN  A

;; ANSWER SECTION:
dl-cdn.alpinelinux.org. 3572    IN  CNAME   global.prod.fastly.net.
global.prod.fastly.net. 17  IN  CNAME   global-ssl.fastly.net.
global-ssl.fastly.net.  29  IN  CNAME   fallback.global-ssl.fastly.net.
fallback.global-ssl.fastly.net. 26 IN   A   185.31.17.249

;; Query time: 13 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Wed Apr 20 18:22:35 2016
;; MSG SIZE  rcvd: 140
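For anyone who wants to compare which POP address their own resolver hands out, the final A record can be pulled out of dig's answer section with a short filter. This is only a sketch: `a_records` is a hypothetical helper name, and it assumes the standard five-column answer layout shown above.

```sh
#!/bin/sh
# Print the address of every A record in dig output read from stdin.
# CNAME rows and ";" comment lines (including the question section) are skipped.
a_records() {
    awk '$1 !~ /^;/ && $3 == "IN" && $4 == "A" { print $5 }'
}
```

Running `dig dl-cdn.alpinelinux.org | a_records` from both the workstation and the EC2 instance would show whether the two locations are being sent to different POPs.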

For comparison, nl.alpinelinux.org:

traceroute to nl.alpinelinux.org (88.159.20.184), 64 hops max, 72 byte packets
 1  router (192.168.1.1)  1.858 ms  1.597 ms  1.223 ms
 2  80.10.236.89 (80.10.236.89)  6.732 ms  9.816 ms  8.768 ms
 3  ae113-0.ncidf104.paris.francetelecom.net (193.249.212.38)  12.191 ms  7.472 ms  7.627 ms
 4  ae44-0.niaub102.aubervilliers.francetelecom.net (193.252.159.46)  6.416 ms  6.145 ms  9.617 ms
 5  193.252.137.70 (193.252.137.70)  21.212 ms  23.017 ms  23.982 ms
 6  tengige0-6-0-27.lontr4.london.opentransit.net (193.251.240.163)  22.494 ms  22.331 ms  24.877 ms
 7  level3-1.gw.opentransit.net (193.251.255.80)  18.210 ms  17.460 ms  16.911 ms
 8  ae-228-3604.edge5.amsterdam1.level3.net (4.69.162.158)  25.906 ms  27.941 ms  23.742 ms
 9  * * *
10  * * *
11  88.159.0.106 (88.159.0.106)  36.410 ms  26.918 ms  26.740 ms
12  184-20-159-88.business.edutel.nl (88.159.20.184)  34.497 ms  29.269 ms  38.215 ms

krallin commented 8 years ago

At a glance, this doesn't seem wrong (at least to me), but I'm curious whether something's wrong with that 185.31.17.249 POP. I'll try to see whether I can hit one of the POPs advertised to my EC2 instance with more success (though perhaps those IPs are anycast?).

krallin commented 8 years ago

All right, so there's definitely something (very) off with that POP. If I hit 199.27.76.249 instead, which is advertised to my EC2 instance in us-east-1 and supposedly located in SF, I get about 8 times the bandwidth.
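One way to repeat this kind of comparison is to pin the hostname to each POP address in turn with curl's `--resolve` option and read back the average download speed. This is a sketch under assumptions, not necessarily how the figure above was measured: `measure_pop` and the `CURL` override are names introduced for this example, and the path is the community index fetched in the timings earlier in the thread.

```sh
#!/bin/sh
# Fetch http://HOST/URL_PATH while forcing HOST to resolve to a specific POP IP,
# then print curl's average download speed in bytes per second.
# Usage: measure_pop HOST IP URL_PATH
measure_pop() {
    host=$1; ip=$2; url_path=$3
    # ${CURL:-curl} lets a caller substitute a stub (e.g. CURL=echo) for a dry run.
    ${CURL:-curl} --silent --output /dev/null \
        --resolve "${host}:80:${ip}" \
        --write-out '%{speed_download}\n' \
        "http://${host}${url_path}"
}
```

For example, calling `measure_pop alpine.gliderlabs.com 185.31.17.249 /alpine/v3.3/community/x86_64/APKINDEX.tar.gz` and then the same with `199.27.76.249` gives two numbers that can be compared directly. A larger file than the index would give a steadier reading.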

krallin commented 8 years ago

Route to that (faster) POP is:

traceroute to alpine.gliderlabs.com (199.27.76.249), 64 hops max, 72 byte packets
 1  router (192.168.1.1)  2.967 ms  1.277 ms  1.576 ms
 2  80.10.236.89 (80.10.236.89)  7.101 ms  6.490 ms  5.958 ms
 3  ae113-0.ncidf104.paris.francetelecom.net (193.249.212.38)  6.301 ms  5.908 ms  5.860 ms
 4  ae44-0.niaub102.aubervilliers.francetelecom.net (193.252.159.46)  6.923 ms  6.416 ms  5.975 ms
 5  193.252.137.70 (193.252.137.70)  19.720 ms  23.632 ms  23.469 ms
 6  tengige0-6-0-27.lontr4.london.opentransit.net (193.251.240.163)  19.812 ms  23.544 ms  23.784 ms
 7  level3-1.gw.opentransit.net (193.251.255.80)  18.128 ms  17.705 ms  19.543 ms
 8  * * *
 9  * * *
10  alpine.gliderlabs.com (199.27.76.249)  93.477 ms  93.305 ms  92.992 ms

andyshinn commented 8 years ago

Is this still a consistent problem, or was it a one-off?