Closed kl52752 closed 4 months ago
/cc @AwesomePatrol /assign @bowei
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: AwesomePatrol, kl52752 Once this PR has been reviewed and has the lgtm label, please ask for approval from bowei. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Did you upload the whole PR? There is only one change (delete Log line)
Did you upload the whole PR? There is only one change (delete Log line)
yes, the problem is that this one log line in some test is called after test return and we believe that this is causing panic. https://screenshot.googleplex.com/3tRWo69XkSfvacf. Calling t.Log before sleep is safe because we know that this line will always be called during test execution. But sleep is not interrupted by context cancel and when in test we expect that executor will exit after timeout which is less than 5sec we abandon the go routine and it can finish after whole test exits. This race happens once in prow we didn't reproduced it locally.
The construction of the test is a bit concerning as it depends very much on timing. Is there a way to redo as much of it as possible in terms of happens before relationships?
something like this:
package exec
import (
"fmt"
"sync"
)
func NewSync() *Sync { return &Sync{states: map[string]*state{}} }
type Sync struct {
lock sync.Mutex
states map[string]*state
}
type state struct {
done bool
waiters []*waiter
}
type waiter struct {
ch chan struct{}
}
// Waiters
func (s *Sync) Wait(key string) {
s.lock.Lock()
st, ok := s.states[key]
if ok && st.done {
s.lock.Unlock()
return
}
if !ok {
st := &state{}
s.states[key] = st
}
// Add to the waiters and block.
w := &waiter{make(chan struct{})}
st.waiters = append(st.waiters, w)
s.lock.Unlock()
<-w.ch
}
func (s *Sync) Signal(key string) {
s.lock.Lock()
st, ok := s.states[key]
if !ok {
s.states[key] = &state{done: true}
s.lock.Unlock()
return
}
st.done = true
for _, w := range st.waiters {
close(w.ch)
}
s.lock.Unlock()
}
func example() {
s := NewSync()
go func() {
s.Wait("EVENT")
fmt.Println("Happens after EVENT")
}()
go func() {
fmt.Println("EVENT signaled")
s.Signal("EVENT")
}()
}
@bowei this test checks how executor behaves when timeout occurs so by definition this test depends on timing. I think that your proposal is over complicated for this simple test (we have only 1 action that needs to be time consuming) and it does not remove time dependency completely because still we need to control when task will be finished. I don't think that spinning another go routines with channel sin test is better solutions than using sleep here.
I added struct that checks that this (one) go routine has finished before test exits so we should not hit the issue when we trie to log after test exits.
New changes are detected. LGTM label has been removed.
I'm recommending the above code as a utility that can be used in more than one test.
The example() is just an example of how it would be used.
I recommend any of these test related sync utilities be pulled out into a utilities package rather than rewritten for every use case.
/retest
/retest
/retest
/hold sth is not working 🤔
/unhold
@bowei I fix the test with reccomended approach, PTAL
Please take a look at the comments...
Parallel executor tests checks that when timeout occurs parallel executor will exit even if long lasting operation scheduled in separate go routine is running. This go routine might finish after the whole test is finished and this is expected behaviour. In this go routine we should not use t.Log which will be invoked at the end of the action because the test might already exit and this can cause panic. https://screenshot.googleplex.com/3tRWo69XkSfvacf