cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

Custom Go Runtime with Lightweight Scheduler Randomization #105162

Open srosenberg opened 1 year ago

srosenberg commented 1 year ago

In support of -race, the Go runtime implements fairly inexpensive random shuffling [1], [2], [3] in the (goroutine) scheduler.

In the CL for [1], Dmitry's comment [4] proposes a non-local perturbation: shuffling the global (runnable) queue. He argues that the implemented change is fairly deterministic; as far as I can tell, his idea was never implemented. His comment:

Yes, it is quite deterministic. And if a thread is scheduled most likely it won't be preempted in the next 20ms at least. And also if two threads start at the same time, but one runs 10ms till the reordering point and the other runs for 1000ms till the reordering point, the chances that these points will almost always executed in the same order. That's why aggressive reordering is necessary.

Since running end-to-end tests (i.e., roachtest) with -race might not be feasible (owing to the performance overhead), lightweight (global) randomization inside the scheduler could help expose (data) race bugs in roachtest without the overhead of TSan.

We already have the infrastructure to create custom builds [5], [6]. Thus, all that's needed is a patch to be applied in CI; roachtests can then be staged with the custom binary to enable runtime fuzzing.

[1] https://github.com/golang/go/issues/11372
[2] https://github.com/golang/go/blob/cd6d225bd30608544ecf4a3e5a7aa1d0607a66db/src/runtime/proc.go#L5953
[3] https://github.com/golang/go/blob/cd6d225bd30608544ecf4a3e5a7aa1d0607a66db/src/runtime/proc.go#L6005
[4] https://go-review.googlesource.com/c/go/+/11795/3#message-8a36b86dfffd37e7fe1a09d51f66d74985236697
[5] https://github.com/cockroachdb/cockroach/blob/76da6c7dfb4d71ad614715ec5e230a6cc76fbd0e/build/teamcity/internal/release/build-and-publish-patched-go/impl.sh#L60-L62
[6] https://github.com/cockroachdb/cockroach/blob/76da6c7dfb4d71ad614715ec5e230a6cc76fbd0e/build/teamcity/internal/release/build-and-publish-patched-go/impl-fips.sh#L6-L8

Jira issue: CRDB-28903

blathers-crl[bot] commented 1 year ago

cc @cockroachdb/test-eng

srosenberg commented 1 year ago

Related: https://github.com/golang/go/issues/43794