Closed while1malloc0 closed 5 years ago
Thank you for submitting this pull request, however I do not see a valid CLA on file for you. Before we can merge this request please visit https://yahoocla.herokuapp.com/ and agree to the terms. Thanks! :smile:
@while1malloc0 thanks for the PR! Great catch - mind signing the CLA the bot posted above? Once that is complete, I can merge this.
Out of curiosity, did you run into this due to a flaky run of the unit tests?
@byxorna Thanks for the awesome project!
I was unable to reproduce a failing test case for the data race on my laptop, but it showed up a few times in CI. The test would fail because actualEvents
was one less than it should have been, which seems to point to this data race.
This was found as part of some work I was doing on a fork of this project for my employer. We needed the functionality described in https://github.com/tumblr/k8s-sidecar-injector/issues/17. As such, I'll have to get sign off on that CLA. I'll update here once I have that.
~I just signed the CLA. Do I need to push to kick off another build?~ edit: it looks like the build updated automatically.
Also, I'm starting to doubt my solution here, and maybe a reviewer can sanity check me. I think that there still might be a data race between the read here: https://github.com/tumblr/k8s-sidecar-injector/pull/18/files#diff-94510f2b305f33933564c2e9ae1e2af7R71 and the read from the stop channel here https://github.com/tumblr/k8s-sidecar-injector/pull/18/files#diff-94510f2b305f33933564c2e9ae1e2af7R54. The race detector seems happy about this solution, but it feels like this solution relies on the read from select
occurring before the two instructions between stopCh <- struct{}{}
finish.
Maybe a better approach would be to use a separate sync.WaitGroup
(or reuse the existing one) in the reading goroutine. I think that would require Coalesce
to close the output channel though, since it's read-only to the caller. It seems like it's safe to call close(output)
at this line https://github.com/tumblr/k8s-sidecar-injector/blob/master/pkg/coalescer/coalescer.go#L48, since we've sent the final event.
You can use the existing waitgroup to make this code race free, as well. wg.Add(1) at L46, and then add a wg.Done() at L58 (and remove your stop channel). That way we will block wg.Wait() until all goroutines have indicated they are done.
I really like the idea of that approach so that there's less extra code needed, but when I add that the test never finishes. It seems like output == nil
is never true.
You can use the existing waitgroup to make this code race free, as well. wg.Add(1) at L46, and then add a wg.Done() at L58 (and remove your stop channel). That way we will block wg.Wait() until all goroutines have indicated they are done.
I really like the idea of that approach so that there's less extra code needed, but when I add that the test never finishes. It seems like
output == nil
is never true.
Yea, great point. I think something like this will cover us:
for {
select {
case v, ok:=<-output:
if !ok {
break
}
actualEvents++
}
}
Won't that have the same problem? Since output
is never being closed, that ok
won't ever fire and we'll block forever reading on the channel?
Won't that have the same problem? Since
output
is never being closed, thatok
won't ever fire and we'll block forever reading on the channel?
Ah yes! So, now that i reread it, we don't actually need to make sure the goroutine that consumes output
needs to be closed - our expectation is that we await the input
routine, wait for the debounce period, and then assert the number of things coming out of the event coalescer is what we expect. To clean up, adding a close(output)
after time.Sleep(debounceDuration)
should be a way to ensure that goroutine is terminated, but we dont need to imo. Thoughts?
heya @while1malloc0, just bumping this thread. Any thoughts on my comment above?
Hey @byxorna, sorry for letting this fall of my radar for a bit.
To clean up, adding
close(output)
aftertime.Sleep
...
I don't think you can close output from the test, since it's a receive-only channel. Since the line stop <-struct{}{}
is synchronous, I think this solution should work unless you think there's a better way to do this.
@while1malloc0 Ok, that sounds good. I have no objections to how this is solved right now. Are you happy with this solution? If so, we can loop in a few more sets of eyes and get this merged :)
Yeah, I'm happy with this. Feel free to merge at your convenience.
@while1malloc0 thanks! merging now - the build tagged :latest
will be out in ~20min
This commit fixes a data race that occurs in the coalescer test when the actualEvents variable is read from in the test assertion before the last event has been written to it.
What and why?
The file pkg/coalescer/coalescer_test.go contains a data race that can occasionally fail a test run. This can be verified with
go test -race ./pkg/coalescer
.Reviewers
Required reviewers:
@byxorna