Open byrneo opened 2 years ago
Does Fluent Bit function normally and successfully send logs after startup? Do these errors only occur on startup?
[2022/04/21 09:58:51] [error] [src/flb_network.c:224 errno=9] Bad file descriptor
[2022/04/21 09:58:51] [error] [http_client] broken connection to 169.254.169.254:80 ?
Both of these errors almost certainly share the same root cause: first the core network library logs the "Bad file descriptor" message, then the HTTP client logs that the connection is therefore broken. 169.254.169.254 is the EC2 IMDS IP. Notice the lines after this about setting a hop limit.
What's happening here is that when each AWS plugin instance is initialized, each one must initialize its credential providers. So it will go through the standard chain of AWS credential sources, including EC2 IMDS, and look for creds. This will happen for each AWS output instance. Hence, you probably got one error message per output instance. For the EC2 provider, it tries IMDS version 2 first: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html
If this fails it falls back to IMDS v1 style requests, where the auth token is omitted.
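To make the v2-then-v1 order concrete, here is a minimal Python sketch of the same idea. It is not Fluent Bit's actual code (that lives in the C credential provider). The function names `http_request` and `fetch_metadata`, the 2-second timeout, and the 21600-second token TTL are assumptions for illustration. The token path and the two header names are the ones documented for IMDS.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"
TOKEN_PATH = "/latest/api/token"
TOKEN_TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds"
TOKEN_HEADER = "X-aws-ec2-metadata-token"


def http_request(method, url, headers, timeout=2.0):
    """Default transport: one HTTP request, returning the body as text."""
    req = urllib.request.Request(url, method=method, headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def fetch_metadata(path, request=http_request):
    """Try IMDSv2 first; if no token can be obtained, retry IMDSv1-style.

    Returns a tuple (imds_version_used, body). The `request` parameter is
    injectable so the logic can be exercised without network access.
    """
    try:
        # IMDSv2: obtain a session token with a PUT request.
        token = request("PUT", IMDS_BASE + TOKEN_PATH, {TOKEN_TTL_HEADER: "21600"})
        headers = {TOKEN_HEADER: token}
        version = "v2"
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        # IMDSv1 fallback: same GET, but with the auth token omitted.
        headers = {}
        version = "v1"
    return version, request("GET", IMDS_BASE + path, headers)
```

On an instance where the token request cannot complete (for example, a hop limit of 1 from inside a container), the first call fails and the second call succeeds only if IMDSv1 is enabled. That would match one error per plugin instance at startup followed by normal operation.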
So I think what is happening here is expected. We wish the errors here were more clear to prevent confusion. @matthewfala Did I miss anything and can you think of any ways to improve the error messaging here?
That's right, @PettitWesley. We're thinking that the issue is a combination of the following:
AWS recommends using IMDSv2. To use it from inside a container, you'll need to set the hop limit to 2 or greater so that requests from the container's network can reach the IMDS endpoint: https://github.com/aws/aws-for-fluent-bit/issues/259#issuecomment-970862321
If you don't want to go through the trouble of increasing the hop limit, you can also enable IMDSv1, in which case it should be detected and used by Fluent Bit.
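For reference, both options map onto the same AWS CLI call, `aws ec2 modify-instance-metadata-options`. The small Python helper below only builds that command line as a string so you can review it before running it. The helper name `hop_limit_command` and the instance ID in the usage example are hypothetical. The flags themselves are the standard ones for that subcommand.

```python
import shlex


def hop_limit_command(instance_id, hop_limit=2, http_tokens="required"):
    """Build the AWS CLI command that changes an instance's IMDS settings.

    http_tokens="required" enforces IMDSv2; "optional" leaves IMDSv1 enabled.
    """
    if not 1 <= hop_limit <= 64:
        raise ValueError("hop limit must be between 1 and 64")
    if http_tokens not in ("optional", "required"):
        raise ValueError("http_tokens must be 'optional' or 'required'")
    argv = [
        "aws", "ec2", "modify-instance-metadata-options",
        "--instance-id", instance_id,
        "--http-put-response-hop-limit", str(hop_limit),
        "--http-tokens", http_tokens,
        "--http-endpoint", "enabled",
    ]
    return shlex.join(argv)


# Placeholder instance ID; substitute your own.
print(hop_limit_command("i-0123456789abcdef0"))
```

Passing `http_tokens="optional"` corresponds to the second option above (keeping IMDSv1 available as a fallback).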
Sorry for the late response @PettitWesley @matthewfala. Yes: Fluent Bit did indeed appear to function normally despite the startup errors.
I've made a bunch of changes in my environment since creating this issue, one of which was to use IRSA with Fluent Bit (previously I had been using an IAM instance role/profile for the EC2 host). I can't be 100% certain that made the difference, but I no longer see the errors during startup.
@PettitWesley @matthewfala I have been struggling with IMDS-related issues. I am using the latest image, 2.31.2:
[2023/02/22 22:03:10] [error] [net] connection #44 timeout after 10 seconds to: 169.254.169.254:80
[2023/02/22 22:03:10] [error] [filter:aws:aws.0] connection initialization error
[2023/02/22 22:03:10] [error] [filter:aws:aws.0] Could not retrieve ec2 metadata from IMDS
[0] dummy: [1677103380.297254617, {"message"=>"dummy"}]
This is what I have in the ConfigMap:
[INPUT]
    Name dummy
    Tag  dummy

[FILTER]
    Name              aws
    Match             *
    imds_version      v2
    az                true
    ec2_instance_id   true
    ec2_instance_type true
    private_ip        true
    ami_id            true
    account_id        true
    hostname          true
    vpc_id            true

[OUTPUT]
    Name  stdout
    Match *
I tried changing the hop limit to 2. Here is a snippet from the EC2 describe output:
"MetadataOptions": {
"State": "applied",
"HttpTokens": "optional", --> tried even with required
"HttpPutResponseHopLimit": 2,
"HttpEndpoint": "enabled",
"HttpProtocolIpv6": "disabled",
"InstanceMetadataTags": "disabled"
}
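As a quick sanity check on settings like these, the sketch below applies the rules discussed earlier in the thread (endpoint enabled, hop limit of at least 2 for containers, IMDSv1 available only when tokens are optional) to a `MetadataOptions` dictionary. The function name `imds_findings` and the wording of the messages are made up for illustration, and the hop-limit rule assumes the container sits one network hop away from the host.

```python
def imds_findings(options, in_container=True):
    """Return a list of likely problems in an EC2 MetadataOptions dict."""
    findings = []
    if options.get("HttpEndpoint") != "enabled":
        findings.append("IMDS endpoint is disabled")
    hop_limit = options.get("HttpPutResponseHopLimit", 1)
    tokens = options.get("HttpTokens", "optional")
    if in_container and hop_limit < 2:
        if tokens == "required":
            findings.append("hop limit < 2 and IMDSv1 disabled: containers cannot use IMDS")
        else:
            findings.append("hop limit < 2: IMDSv2 will fail, only the IMDSv1 fallback can work")
    return findings


# The settings posted above.
posted = {
    "State": "applied",
    "HttpTokens": "optional",
    "HttpPutResponseHopLimit": 2,
    "HttpEndpoint": "enabled",
    "HttpProtocolIpv6": "disabled",
    "InstanceMetadataTags": "disabled",
}
print(imds_findings(posted))
```

An empty list for the posted settings would suggest the instance-side configuration is not the problem, which points toward the pod's network path instead.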
I am trying to use this metadata plugin to enrich the logs, specifically with the instance ID. Is there something I am missing? What needs to be set on the EC2 side to get https://docs.fluentbit.io/manual/pipeline/filters/aws-metadata to work?
@vkadi that should work... what network setup are your containers running in? Can you try ssh/kubectl exec into the pod and see if you can reach IMDS via curl?
@PettitWesley I am running this on an EKS cluster, and from the pods I am not able to access the metadata:
bash-4.2# curl http://169.254.169.254/latest/meta-data/
curl: (28) Failed to connect to 169.254.169.254 port 80 after 129614 ms: Couldn't connect to server
@vkadi then something about your network configuration is blocking access. I am not sure what. I know there are some CNI plugins that will block link local IP addresses from pods, which would block IMDS.
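If curl is not available in the image, a plain TCP connect from inside the pod answers the same question: can the pod reach the link-local address at all? This is only a reachability sketch. The function name `can_connect` and the 2-second timeout are arbitrary choices, and a successful connect says nothing about IMDSv2 tokens or hop limits.

```python
import socket


def can_connect(host="169.254.169.254", port=80, timeout=2.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


if __name__ == "__main__":
    print("IMDS reachable:", can_connect())
```

A `False` result from the pod combined with `True` from the host itself would be consistent with the CNI or a network policy blocking link-local traffic.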
@PettitWesley By enabling "hostNetwork: true" I was able to access IMDS from the Fluent Bit pod, as mentioned in this doc: https://docs.fluentbit.io/manual/pipeline/filters/kubernetes
Fluent Bit Log Output
Fluent Bit Version Info
Fluent Bit v1.8.15
AWS for Fluent Bit Container Image Version 2.23.3