JuliaCloud / AWS.jl

Julia interface to AWS
MIT License

IMDS connection timeouts can slow down `AWSConfig` #651

Closed omus closed 1 year ago

omus commented 1 year ago

In #649 there was an unhandled connection timeout exception. That exception is now handled, but the specified connection timeout isn't respected. With `connect_timeout=1`, the request below still takes about 11 seconds:

julia> e = @time try; HTTP.request("PUT", "http://169.254.169.254/latest/api/token"; connect_timeout=1, retry=false, status_exception=true); catch e; e; end
 11.267654 seconds (206 allocations: 75.328 KiB)
HTTP.Exceptions.RequestError(HTTP.Messages.Request:
"""
PUT /latest/api/token HTTP/1.1
Host: 169.254.169.254
Accept: */*
User-Agent: HTTP.jl/1.8.5
Content-Length: 0
Accept-Encoding: gzip

""", Base.IOError("read: connection timed out (ETIMEDOUT)", -110))

This connection timeout accounts for most of the time needed to initialize an `AWSConfig`:

julia> @time try AWSConfig(); catch e; e; end
 11.243566 seconds (272 allocations: 82.812 KiB)
Can't find AWS credentials!

This error can be reproduced by running a Docker container within an EC2 instance:

docker run -it julia:1.8.5-buster julia -e 'using Pkg; Pkg.add(PackageSpec(name="AWS", version=v"1.90.0")); using AWS; AWSConfig()'
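
For comparison outside Julia, the same token request can be issued with curl, which separates a connection-phase timeout from an overall deadline. This is only a sketch and not part of AWS.jl: the function name `imds_token` and the `IMDS_ENDPOINT` override are hypothetical, and the 21600-second token TTL is an illustrative value. Since the error above is reported on `read`, it is presumably the cap on the whole request (`--max-time`) rather than the connect timeout that bounds the wait.

```shell
#!/bin/sh
# Hypothetical helper: request an IMDSv2 token with both a connect timeout
# and an overall deadline, so a missing response cannot stall the caller.
# IMDS_ENDPOINT is an assumed override used only for illustration.
imds_token() {
    endpoint="${IMDS_ENDPOINT:-http://169.254.169.254}"
    # --connect-timeout bounds only connection establishment;
    # --max-time bounds the entire request, including waiting for the response.
    curl --silent --show-error --fail \
        --connect-timeout 1 --max-time 2 \
        -X PUT "${endpoint}/latest/api/token" \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"
}
```

A caller could then fall back quickly, for example `token=$(imds_token) || echo "IMDSv2 token unavailable" >&2`.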

omus commented 1 year ago

The changes in #655 mitigate this issue. The underlying problem is that the IMDSv2 token request packet is dropped when it hits the TTL (hop) limit. When that happens, HTTP.jl keeps waiting for a response from the server, which is why we get this timeout. With #655 we now emit a warning that tells the user about the problem and provides guidance for correcting it:

julia> @time IMDS.request(IMDS.Session(), "GET", "/latest/meta-data/placement/region")
┌ Warning: IMDSv2 token request rejected due to reaching hop limit. Consider increasing the hop limit to avoid delays upon initial use:
│ https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html#imds-considerations
└ @ AWS.IMDS ~/.julia/packages/AWS/FdHPe/src/IMDS.jl:69
 11.264893 seconds (613 allocations: 301.398 KiB)
HTTP.Messages.Response:
"""
HTTP/1.1 200 OK
Content-Type: text/plain
Accept-Ranges: none
Last-Modified: Tue, 01 Aug 2023 17:31:40 GMT
Content-Length: 9
Date: Tue, 01 Aug 2023 18:10:44 GMT
Server: EC2ws
Connection: close

us-east-2"""

Setting the hop limit to 2 in this case fixes the delay:

julia> @time IMDS.request(IMDS.Session(), "GET", "/latest/meta-data/placement/region")
  0.001764 seconds (460 allocations: 429.383 KiB)
HTTP.Messages.Response:
"""
HTTP/1.1 200 OK
X-Aws-Ec2-Metadata-Token-Ttl-Seconds: 600
Content-Type: text/plain
Accept-Ranges: none
Last-Modified: Tue, 01 Aug 2023 17:31:40 GMT
Content-Length: 9
Date: Tue, 01 Aug 2023 18:10:54 GMT
Server: EC2ws
Connection: close

us-east-2"""

julia> @time try AWSConfig(); catch e; e; end
  0.000294 seconds (1 allocation: 16 bytes)
UndefVarError(:AWSConfig)
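
The hop-limit change mentioned above is made on the EC2 instance's metadata options. The following is a sketch using the AWS CLI, assuming the CLI is installed and has permission to modify the instance. The wrapper name `raise_hop_limit` is hypothetical, and the instance ID must be supplied by the caller.

```shell
#!/bin/sh
# Hypothetical wrapper around the AWS CLI; pass the ID of the EC2 instance
# that hosts the container as the first argument.
raise_hop_limit() {
    instance_id="$1"
    # Raise the IMDSv2 PUT response hop limit to 2, the value that fixed
    # the delay in the example above.
    aws ec2 modify-instance-metadata-options \
        --instance-id "$instance_id" \
        --http-put-response-hop-limit 2 \
        --http-endpoint enabled
}
```

Usage would look like `raise_hop_limit i-0123456789abcdef0`, where the instance ID shown is a placeholder.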