golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

runtime: expose faketime for testing on normal architectures #22549

Open whyrusleeping opened 7 years ago

whyrusleeping commented 7 years ago

The faketime 'secret feature' that gets used by the playground is actually really interesting to use for writing tests that normally depend on time. For example, go-ethereum uses it to run network testing simulations reproducibly.

The downside is that it's hacky to enable, and it only works on nacl. Is there any way we can get this exposed on other architectures?

cc @fjl and @karalabe

bradfitz commented 7 years ago

This isn't an interface we want to guarantee & support long-term in its current form.

whyrusleeping commented 7 years ago

please?

ianlancetaylor commented 7 years ago

There are a number of fake clock implementations in third-party packages. If you want one in the standard library, you need to tell us what it does that cannot be done in a different package.

fjl commented 7 years ago

When I wrote the thing that @whyrusleeping linked above, I tried various fake clock implementations. There are multiple issues with them:

ianlancetaylor commented 7 years ago

How would exposing the runtime's fake clock solve any of those problems? What would the API be?

fjl commented 7 years ago

I understand that runtime.faketime was added to support the Go Playground. There are good reasons for not exposing it because there are no guarantees beyond what the playground needs.

Maybe it's best to talk about what we want to achieve, rather than talking about how to expose the existing fake clock as is. What I wanted at the time was a way to run my program where:

1) Executing Go code takes zero (virtual) time.
2) All goroutines run until they're either blocked on synchronization or voluntarily go to sleep.
3) Once they're all blocked/asleep, (virtual) time advances to the wakeup time of the closest timer.
4) goto 2

We used runtime.faketime to perform a simulation of our peer-to-peer network protocol and it seemed to provide this mode of execution. The fact that it really only works with NaCl was limiting because at most 4GB of memory can be used for the simulation.

I think it would be nice to have 'official' support for fake time because code under test could just use the normal clock and timers as provided by package time, but run with virtual time for testing/simulation purposes. Making this a NaCl-only feature is reasonable because it's an isolated environment without any access to the OS.

whyrusleeping commented 6 years ago

The key here is that we don't skip forward in time to unblock sleepers until all goroutines are blocked (sleeping, or waiting on channels). The only way to do this outside of exposing this runtime.faketime code would be to wrap all of our code in extra logic to inform our custom scheduler when we're blocking (meaning replacing all channel sends and receives with myscheduler.ChanSend(ch, val) and myscheduler.ChanRecv(ch)) in order to guarantee that things happen in the right order.

My primary use case here is testing code that relies on timing between different actors. Currently, these tests take a very long time because I'm trying to use 'close to real' values for the timeouts and delays. When using these 'close to real' values, the code runs and works as expected. But if I ratchet down those durations, then fluctuations in the scheduler and actual runtime costs make the simulation less accurate.

One way to satisfy my needs would be to add a callback that I can set for when all goroutines are blocked (instead of the current behavior where it panics). Then I could implement my scheduler such that all sleep calls just block on a channel, and when that callback hits, I roll forward time until I hit the first deadline set by one of those sleeps, and unblock that sleep.

whyrusleeping commented 6 years ago

something like:

runtime.SetDeadlockHandler(func() {
    myscheduler.AdvanceTime()
})

And then in runtime/proc.go, near the end of checkdead() we could add a check to see if there is a handler set, call it if it is, and then recheck if we're deadlocked.