aws / aws-xray-sdk-go

AWS X-Ray SDK for the Go programming language.
Apache License 2.0
280 stars 118 forks source link

Data races on http.DefaultTransport with rc15 #217

Open dcu opened 4 years ago

dcu commented 4 years ago
WARNING: DATA RACE
Read at 0x0000022e7450 by goroutine 12:
  net/http.(*Client).send()
      /usr/local/Cellar/go/1.14.1/libexec/src/net/http/client.go:176 +0x39d
  net/http.(*Client).do()
      /usr/local/Cellar/go/1.14.1/libexec/src/net/http/client.go:699 +0x2cc
  net/http.(*Client).Do()
      /usr/local/Cellar/go/1.14.1/libexec/src/net/http/client.go:567 +0x7f
  github.com/aws/aws-sdk-go/aws/corehandlers.sendFollowRedirects()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-sdk-go@v1.30.7/aws/corehandlers/handlers.go:120 +0x2f
  github.com/aws/aws-sdk-go/aws/corehandlers.glob..func3()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-sdk-go@v1.30.7/aws/corehandlers/handlers.go:112 +0xfd
  github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-sdk-go@v1.30.7/aws/request/handlers.go:267 +0xd1
  github.com/aws/aws-sdk-go/aws/request.(*Request).sendRequest()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-sdk-go@v1.30.7/aws/request/request.go:578 +0xf0
  github.com/aws/aws-sdk-go/aws/request.(*Request).Send()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-sdk-go@v1.30.7/aws/request/request.go:531 +0x1bf
  github.com/aws/aws-sdk-go/service/xray.(*XRay).GetSamplingRules()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-sdk-go@v1.30.7/service/xray/api.go:869 +0x5e
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*proxy).GetSamplingRules()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-xray-sdk-go@v1.0.0-rc.15/strategy/sampling/proxy.go:86 +0x95
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).refreshManifest()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-xray-sdk-go@v1.0.0-rc.15/strategy/sampling/centralized.go:264 +0x15c
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).startRulePoller.func1()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-xray-sdk-go@v1.0.0-rc.15/strategy/sampling/centralized.go:207 +0x50

Previous write at 0x0000022e7450 by goroutine 19:
  github.com/jarcoal/httpmock.Deactivate()
      /Users/dcuadrado/go/pkg/mod/github.com/jarcoal/httpmock@v1.0.5/transport.go:716 +0xaf
  github.com/jarcoal/httpmock.DeactivateAndReset()
      /Users/dcuadrado/go/pkg/mod/github.com/jarcoal/httpmock@v1.0.5/transport.go:740 +0x2f
...

Goroutine 12 (running) created at:
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).startRulePoller()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-xray-sdk-go@v1.0.0-rc.15/strategy/sampling/centralized.go:206 +0x4c
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).start()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-xray-sdk-go@v1.0.0-rc.15/strategy/sampling/centralized.go:196 +0x104
  github.com/aws/aws-xray-sdk-go/strategy/sampling.(*CentralizedStrategy).ShouldTrace()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-xray-sdk-go@v1.0.0-rc.15/strategy/sampling/centralized.go:134 +0xcd1
  github.com/aws/aws-xray-sdk-go/xray.BeginSegmentWithSampling()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-xray-sdk-go@v1.0.0-rc.15/xray/segment.go:80 +0x5fe
  github.com/aws/aws-xray-sdk-go/xray.BeginSegment()
      /Users/dcuadrado/go/pkg/mod/github.com/aws/aws-xray-sdk-go@v1.0.0-rc.15/xray/segment.go:60 +0x9c
...
bhautikpip commented 4 years ago

Hi @dcu ,

would you mind describing your application setup and provide repro steps ?

bhautikpip commented 4 years ago

Looks like data race is not happening within X-Ray SDK but when this goroutine (https://github.com/aws/aws-xray-sdk-go/blob/master/strategy/sampling/centralized.go#L206) tries to make a http request, this codebase (https://github.com/jarcoal/httpmock/blob/v1/transport.go#L716) from httpmock library updates the value of http.DefaultTransport.

dcu commented 4 years ago

is there a way to disable xray so I can do that in my tests?

bhautikpip commented 4 years ago

Hi @dcu ,

Sorry for the inconvenience. Currently we don't have support for disabling XRay in XRay Go SDK. We have a pending item in our backlog to support this feature. Meanwhile I am looking more into this issue to see if we can provide any work around to avoid this issue.

dcu commented 4 years ago

maybe allowing to set my own transport or http client

bhautikpip commented 4 years ago

Hi @dcu ,

That's a good idea I will look more into it to see if we can do that. Also, I'm currently working on disabling XRay Go SDK using environment variable (https://github.com/aws/aws-xray-sdk-go/pull/219). Let me know if this will be able to solve your issue. Would you mind trying this change out and provide your feedback ?

dcu commented 4 years ago

I just did, setting AWS_XRAY_SDK_DISABLED to TRUE works and doesn't produce a race condition in my tests.

Still I think you should also allow to set the http client since the default one is not great, pool size is too limited and timeouts are not configured by default

bhautikpip commented 4 years ago

Hi @dcu ,

That's good to hear. Would you mind providing more details about http client approach you're talking about ? I want to understand more on that so that we can provide best possible solution to avoid this issue.

dcu commented 4 years ago

from what I see, XRay is using the default HTTP client to send the data to its API. It'd be better if you expose a way to change that client so I can use my own configured client.

In the offical AWS SDK you can configure this when creating a session.Session by passing a aws.Config but it doesn't seem possible to do that with the xray SDK

dcu commented 4 years ago

In your patch, if I disable xray would I still catch panics like this: failed to begin subsegment named 'kinesis': segment cannot be found. ?

bhautikpip commented 4 years ago

No, Ideally you shouldn't receive those panics if you have disabled xray. Are you getting those panics ?

dcu commented 4 years ago

no I didn't test that. My concern is not catching those in tests but later after deployment

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

dcu commented 4 years ago

any news on this issue?

bhautikpip commented 4 years ago

Hi @dcu ,

We still need to prioritize those efforts as you have mentioned (exposing a way to change default client while calling X-Ray API) in comments. I will put this item in our backlog. Meanwhile we have this PR ready to merge in (https://github.com/aws/aws-xray-sdk-go/pull/219). Let me know if you have any concerns about this PR and we can work accordingly.