aws / aws-xray-daemon

The AWS X-Ray daemon listens for traffic on UDP port 2000, gathers raw segment data, and relays it to the AWS X-Ray API.
Apache License 2.0

[Error] Unable to fetch region from EC2 metadata: EC2MetadataRequestError: failed to get EC2 instance identity document #203

Open florianakos opened 1 year ago

florianakos commented 1 year ago

Hi

We are running the X-Ray daemon 3.3.7 (the latest at the time) on EC2 instances as part of an ECS task. Until recently we ran 3.2.0, but we hit issues when we tried to enforce EC2 IMDSv2 via a config setting. I upgraded to 3.3.7 hoping it would help, but it does not; the daemon still fails to start:

2023-08-10T11:42:19Z [Info] Initializing AWS X-Ray daemon 3.3.7
2023-08-10T11:42:23Z [Error] Cannot fetch region variable from config file, environment variables, ecs metadata, or ec2 metadata. Use local-mode to use the local session region.
status code: 401, request id:
caused by: EC2MetadataError: failed to make EC2Metadata request
2023-08-10T11:42:23Z [Error] Unable to fetch region from EC2 metadata: EC2MetadataRequestError: failed to get EC2 instance identity document

After digging through the repository, it seems to me that the issue might be that the AWS SDK for Go (v1) does not support IMDSv2 seamlessly: https://github.com/aws/aws-xray-daemon/blob/56bcdadc0e5808f4428ed6e3e54a88a2ceca2f82/pkg/conn/conn.go#L45C11-L45C11

Based on my research, the AWS SDK for Go (v2) should support this out of the box. Can you confirm, and advise when such support can be expected in the X-Ray daemon? Thanks!
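For readers unfamiliar with the difference: an instance that requires IMDSv2 answers metadata requests that carry no session token with a 401, which matches the `status code: 401` in the log above. The following is a minimal sketch of the token-based flow using only the Python standard library. The helper names (`token_request`, `document_request`, `fetch_region`) and the `base_url` parameter are illustrative and are not part of the daemon or of any AWS SDK; the paths and header names are the ones documented for EC2 instance metadata.

```python
import json
import urllib.request

# Link-local metadata address, as seen in the daemon logs in this thread.
IMDS_BASE = "http://169.254.169.254"


def token_request(base_url=IMDS_BASE, ttl_seconds=60):
    """Build the PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        base_url + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def document_request(token, base_url=IMDS_BASE):
    """Build the GET request for the identity document, with the token attached."""
    return urllib.request.Request(
        base_url + "/latest/dynamic/instance-identity/document",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_region(base_url=IMDS_BASE, timeout=2.0):
    """Return the region from the instance identity document via IMDSv2.

    Hypothetical helper: only meaningful when run where the metadata
    endpoint is reachable (for example on the EC2 host itself).
    """
    with urllib.request.urlopen(token_request(base_url), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(document_request(token, base_url), timeout=timeout) as resp:
        return json.loads(resp.read())["region"]
```

Running `fetch_region()` on the host and then inside the task's network namespace is one way to check whether the token step itself succeeds from where the daemon runs.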

wangzlei commented 1 year ago

Hi, thanks for providing this feedback. According to "Compatibility with AWS SDKS" in the public IMDSv2 documentation, the AWS SDK for Go v1 should be fine; we will investigate further.

As a workaround for now, could you add -o on the command line to bypass IMDS, and set the region manually with -n?
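A sketch of how that workaround might be wired into an ECS container definition, written as a small Python helper that emits the JSON fragment. The function names, the container name, and the choice of the `amazon/aws-xray-daemon` image are assumptions for illustration; the sketch also assumes the image's entrypoint is the daemon binary, so that `command` entries are passed to it as arguments. Only the `-o` and `-n` flags come from the comment above.

```python
import json


def local_mode_args(region):
    """Return daemon arguments that skip IMDS (-o) and set the region (-n)."""
    if not region:
        raise ValueError("region must be a non-empty string such as 'us-east-1'")
    return ["-o", "-n", region]


def xray_container_definition(region):
    """Build an illustrative ECS container definition fragment for the daemon.

    Assumption: the image entrypoint is the daemon binary, so "command"
    is appended to it as arguments.
    """
    return {
        "name": "xray-daemon",              # hypothetical container name
        "image": "amazon/aws-xray-daemon",  # assumed image
        "command": local_mode_args(region),
        "portMappings": [{"containerPort": 2000, "protocol": "udp"}],
    }


if __name__ == "__main__":
    print(json.dumps(xray_container_definition("us-east-1"), indent=2))
```

With `-o` set, the daemon should not need the instance identity document to determine its region, so the region passed to `-n` has to be kept in sync with the deployment by hand.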

florianakos commented 1 year ago

Thanks for the response.

It does seem that v1 of the Go SDK should support IMDSv2. Can you comment on why it does not seem to work in the X-Ray daemon? Are there changes needed to make that happen?

I was able to work around the region issue by setting the AWS_REGION environment variable, but I am now running into another issue in pkg/telemetry, when it tries to get the instance ID:

2023-09-21T07:01:23Z [Error] Get instance id metadata failed: EC2MetadataError: failed to make EC2Metadata request
status code: 401, request id:

wangzlei commented 1 year ago

I launched an EC2 instance (Linux) and started the X-Ray daemon by following https://docs.aws.amazon.com/xray/latest/devguide/xray-daemon-ec2.html, and did not hit the EC2MetadataError issue.

Could you tell us how you run the X-Ray daemon?

cheruvian commented 4 weeks ago

We are seeing this issue as well:

2024-10-29T23:50:21Z [Info] Initializing AWS X-Ray daemon 3.3.13
2024-10-29T23:50:21Z [Info] Using buffer memory limit of 318 MB
2024-10-29T23:50:21Z [Info] 5088 segment buffers allocated
2024-10-29T23:50:34Z [Error] Unable to fetch region from EC2 metadata: EC2MetadataRequestError: failed to get EC2 instance identity document
caused by: RequestError: send request failed
caused by: Get "http://169.254.169.254/latest/dynamic/instance-identity/document": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

2024-10-29T23:50:34Z [Error] Cannot fetch region variable from config file, environment variables, ecs metadata, or ec2 metadata. Use local-mode to use the local session region.

Any ideas on a fix? For now we have mitigated this by setting AWS_REGION, but the failure took down our ECS deployments because we had the X-Ray container marked as essential.

FWIW, this is what the log looks like when we set AWS_REGION:

[root@ip-10-0-178-89 _data]# docker logs 320232532df7
2024-10-30T00:03:12Z [Info] Initializing AWS X-Ray daemon 3.3.13
2024-10-30T00:03:12Z [Info] Using buffer memory limit of 318 MB
2024-10-30T00:03:12Z [Info] 5088 segment buffers allocated
2024-10-30T00:03:12Z [Info] Using region: us-east-1
2024-10-30T00:03:37Z [Error] Get instance id metadata failed: RequestError: send request failed
caused by: Get "http://169.254.169.254/latest/meta-data/instance-id": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2024-10-30T00:03:37Z [Info] HTTP Proxy server using X-Ray Endpoint : https://xray.us-east-1.amazonaws.com
2024-10-30T00:03:37Z [Info] Starting proxy http server on 0.0.0.0:2000
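
The two reports in this thread fail in different ways: the 3.3.7 logs show `status code: 401`, while the 3.3.13 logs show `Client.Timeout exceeded while awaiting headers`. A 401 means the metadata endpoint answered and refused the request; a timeout means no answer arrived at all. The sketch below separates those cases with a token-less probe from wherever the daemon runs. `classify_imds_failure` and `probe_instance_id` are hypothetical names, and the labels they return are illustrative. One possible cause of timeouts inside containers is the IMDSv2 token response hop limit (default 1), but nothing in this thread confirms that.

```python
import socket
import urllib.error
import urllib.request


def classify_imds_failure(exc):
    """Map an exception from a metadata request to a short label."""
    # HTTPError subclasses URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return "token-rejected" if exc.code == 401 else "http-error"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, urllib.error.URLError):
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "unreachable"
    return "unknown"


def probe_instance_id(base_url="http://169.254.169.254", timeout=2.0):
    """Issue a token-less GET for the instance ID and label the outcome."""
    url = base_url + "/latest/meta-data/instance-id"
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return "ok"
    except Exception as exc:  # broad on purpose: this is a diagnostic probe
        return classify_imds_failure(exc)
```

A result of `token-rejected` points at the IMDSv1-versus-IMDSv2 question raised at the top of the thread, whereas `timeout` points at network reachability between the container and the metadata address.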