open-telemetry / opentelemetry-go-contrib

Collection of extensions for OpenTelemetry-Go.
https://opentelemetry.io/
Apache License 2.0
1.13k stars 536 forks source link

ec2 detector does not respect context timeout #5635

Open OrHayat opened 3 months ago

OrHayat commented 3 months ago

Description

ec2 detector does not respect context timeout and it wait ~30 seconds on non-ec2 machines(default)

Environment

any env that is not on ec2 machine

Steps To Reproduce

  1. Using this code snippet ctx, cancel := context.WithTimeout(context.Background(), time.Second*2) defer cancel() ctx:=context.WithTimeOut(ctx) ec2detector.NewResourceDetector().Detect(ctx)
  2. Run this code on any machine that is NOT in aws ec2

Expected behavior

expected the code to stop the detection attempt much much faster(2 seconds) but there issues in the ec2 detector code that dont allow this to work as expected 1) it uses the soon to be deprecated aws v1 sdk that dont force you to pass context to aws operation(you can still but the code didnt did that)

2) also if this will be fixed then there is issue in the opentelemtry resource.detect function(in the opentelemetry-go/sdk/resource package) that will mean that every detector that was specified after this detector when creating new resource will get context that expired because they dont run in parallel. but that issue is for the main open-telemetry repo

bhaskarbanerjee commented 3 months ago

Is the ec2 detector at all meant to run on non ec2 machines?

MrAlias commented 3 months ago

Is the ec2 detector at all meant to run on non ec2 machines?

No it is not. Though, it should fail gracefully when it is not running there. I think eating up all the context time-out is not idea.

MrAlias commented 3 months ago

We should determine the SLA for the AWS response time to the EC2 metadata endpoint. The detector should then create a child context that has that set as its timeout so the parent timeout isn't completely used.