Closed: Emyrk closed this issue 1 year ago.
Hi @Emyrk, thanks for the ticket. Are you seeing multiple goroutines being created for timeouts? Or just the one maintenance goroutine?
If you use timeouts, the design has a single ongoing goroutine to maintain the timeouts. That goroutine should exist for the longest timeout set by the user (even if the regex finishes before the timeout). I don’t consider that a leak because the resources are fixed and knowable up front. However, if you’re seeing something different then that’s definitely a problem I want to fix.
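For reference, opting into a timeout looks roughly like this (the pattern and values here are just placeholders):

package main

import (
	"fmt"
	"time"

	"github.com/dlclark/regexp2"
)

func main() {
	re := regexp2.MustCompile(`a(b|c)+d`, regexp2.None)
	// Per-regex timeout; the shared clock goroutine wakes up periodically to
	// enforce it and sticks around for at least the longest timeout it has seen.
	re.MatchTimeout = 250 * time.Millisecond

	ok, err := re.MatchString("abcbcd")
	fmt.Println(ok, err)
}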
I am only seeing the one goroutine.
The issue I have is that we are using a highlighting lib called Chroma, which we have been using for a while. With this new goroutine, our unit tests fail because we use https://github.com/uber-go/goleak, which requires all goroutines to be exited at the end of the unit tests.
Unfortunately I have no control over Chroma, which uses regexp2, so I am unable to make sure this goroutine is exited at the end of the unit test.
I recognize the benefit of this feature, and just wanted to start a conversation about whether this can be resolved in a clean way.
I’m definitely open to ideas. Conceptually, the issue we had was that the overhead of tracking running/completed regexes was higher than the overhead of just letting the goroutine stick around, wake up, and check for a timeout.
For production scenarios this trade-off is worth it, but unit tests are obviously different. Maybe I could add an opt-in package mode for unit testing that changes the timeout tracking behavior to shut down faster (and consume more resources). It wouldn’t work for benchmarks, but I suspect that’s ok.
Yea, I'm internally struggling with what approach to take. In production, I have no issue with this.
A package mode is an interesting idea. It is unfortunate we'd have to call something in our unit tests, and it wouldn't be transparent. As in, you have to catch the leak, look into why, find this toggle.
Is it possible to know "no more matches are in progress"? And if it is a matter of the wakeup time, could that be configurable? Then it can be used for unit tests, but also just a package setting.
package stuff_test

import (
	"time"

	"github.com/dlclark/regexp2"
)

func init() {
	// Pass in some custom "wake up" check time, rather than the default 100ms?
	regexp2.SetClockSleep(time.Millisecond)
}
Ok, so I've coded up a possible solution here. Before I publish it I figure I'd get validation this would fix your issue.
I assume you have one or more unit tests that have defer goleak.VerifyNone(t) at the top. In those tests you would add a defer regexp2.StopTimeoutClock() and it'll make sure the timeout goroutine is done before it returns.
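Roughly like this in a single test (the test name and body are placeholders; deferred calls run last-in, first-out, so the clock is stopped before goleak checks):

package stuff_test

import (
	"testing"

	"github.com/dlclark/regexp2"
	"go.uber.org/goleak"
)

func TestHighlight(t *testing.T) {
	defer goleak.VerifyNone(t)       // runs last: fails the test if goroutines are still alive
	defer regexp2.StopTimeoutClock() // runs first: stops the timeout goroutine

	// ... exercise Chroma / regexp2 with timeouts here ...
}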
This will likely add ~100ms to each test that does this. If that time is too long then you could also add an init function to set the timeout period to 1ms:
func init() {
	// speed up testing by making the timeout clock check every 1ms
	regexp2.SetTimeoutCheckPeriod(time.Millisecond)
}
Thoughts?
I am pretty sure that will work!
Goleak is to be run in TestMain(m *testing.M), so we only need to put this in that one spot. This means StopTimeoutClock will be called when all unit tests in the package are complete, and no other code will interact with that clock until the next unit test package starts.
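Something along these lines should do it (a sketch on my end, using goleak.Find directly so the clock can be stopped before the leak check, rather than goleak.VerifyTestMain):

package stuff_test

import (
	"os"
	"testing"
	"time"

	"github.com/dlclark/regexp2"
	"go.uber.org/goleak"
)

func TestMain(m *testing.M) {
	// Shorten the check period so StopTimeoutClock returns quickly.
	regexp2.SetTimeoutCheckPeriod(time.Millisecond)

	code := m.Run()

	// Stop the timeout goroutine before checking for leaks.
	regexp2.StopTimeoutClock()
	if err := goleak.Find(); err != nil {
		os.Stderr.WriteString("goleak: " + err.Error() + "\n")
		code = 1
	}
	os.Exit(code)
}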
Appreciate you hearing this out.
With the introduction of fastclock, it spawns a goroutine with a given timeout.
https://github.com/dlclark/regexp2/blob/master/fastclock.go#L75
This timeout defaults to "forever".
https://github.com/dlclark/regexp2/blob/master/regexp.go#L22-L32
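For illustration, overriding that default looks something like this (assuming, as I read the code, that the package-level default is copied into each regex at compile time):

package main

import (
	"fmt"
	"time"

	"github.com/dlclark/regexp2"
)

func main() {
	// Assumption on my part: lowering the default before compiling gives
	// newly compiled regexes a finite timeout instead of the effectively
	// infinite one.
	regexp2.DefaultMatchTimeout = 500 * time.Millisecond

	re := regexp2.MustCompile(`(a+)+$`, regexp2.None)
	ok, err := re.MatchString("aaaaaaaaaaaaaaaaaaaaaaaaa!")
	fmt.Println(ok, err)
}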
If you use uber-go/goleak in your unit tests, this goroutine gets reported as a leak.
I am using Chroma, which sets the timeout to 250ms. That is better than "forever", but it still leaks a goroutine in my quicker tests.
I do not know the solution, but can a way be implemented to make sure this goroutine is stopped when it is no longer needed? Could we store the number of Matches that are using the clock, and stop the goroutine as soon as they all finish? As someone who is new to this repo, I am not 100% sure. It is just a problem we are hitting now in our unit tests.
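To sketch what I mean (a toy illustration of the counting idea only, not how fastclock actually works):

package sketch

import (
	"sync"
	"time"
)

// refClock counts the matches that are using the clock and stops the
// goroutine once the count drops to zero.
type refClock struct {
	mu      sync.Mutex
	active  int           // matches currently relying on the clock
	stop    chan struct{} // closed to tell the goroutine to exit
	running bool
}

// acquire is called when a match with a timeout starts.
func (c *refClock) acquire() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.active++
	if !c.running {
		c.stop = make(chan struct{})
		c.running = true
		go c.run(c.stop)
	}
}

// release is called when a match finishes; once no matches remain,
// the clock goroutine is told to exit.
func (c *refClock) release() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.active--
	if c.active == 0 && c.running {
		close(c.stop)
		c.running = false
	}
}

// run wakes up periodically to check timeouts until told to stop.
func (c *refClock) run(stop chan struct{}) {
	t := time.NewTicker(100 * time.Millisecond)
	defer t.Stop()
	for {
		select {
		case <-stop:
			return
		case <-t.C:
			// check in-flight matches for expired timeouts here
		}
	}
}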