go-graphite / carbonzipper

proxy to transparently merge graphite carbon backends
Other
104 stars 29 forks source link

Setup an empty response to merge into #62

Closed gunnihinn closed 6 years ago

gunnihinn commented 6 years ago

Instead of using the first response that comes along as a base for our reply and merging other responses into it, we setup an empty response that we merge other ones into. This fixes the race condition unveiled in commit a53ec78, as well as a host of race conditions we encounted in production, starting with [1] below.

I'm not sure what the problem here was. At first I thought that merging the responses inside the select statement was the source of our problems, so I moved the merging out of it. That got rid of the data race in [1], but introduced new ones in

carbonzipper/zipper/types.(*ServerFetchResponse).Merge()

where the slice in first.Response.Metrics was being modified somewhere else while we were looping through it. I was not able to find where this was actually happening after staring at stack traces program internals with delve, so we'll content ourselves with having whacked this mole somehow, mysteriously.

[1] Production data race stacktrace:

==================
WARNING: DATA RACE
Read at 0x00c42012c380 by goroutine 946:
github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/types.(*ServerFetchResponse).Merge()
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/types/stats.go:23 +0xc2b
github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast.(*BroadcastGroup).Fetch()
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast/broadcast_group.go:234 +0xcd2
github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast.(*BroadcastGroup).doSingleFetch()
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast/broadcast_group.go:179 +0x77f
Previous write at 0x00c42012c380 by goroutine 1720:
github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/types.(*ServerFetchResponse).Merge()
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/types/stats.go:23 +0xc1d
github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast.(*BroadcastGroup).Fetch()
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast/broadcast_group.go:234 +0xcd2
github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast.(*BroadcastGroup).doSingleFetch()
/home/redacted/go/src/github.com/go-graphite/carbonapi/vendor/github.com/go-graphite/carbonzipper/zipper/broadcast/broadcast_group.go:179 +0x77f
[...]
Civil commented 6 years ago

Lgtm. Maybe put somewhere a note to dig into that later