cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com
Other
29.52k stars 3.7k forks source link

server: investigate why TestStatusGetFiles does not work with a shared process virtual cluster #112956

Open herkolategan opened 8 months ago

herkolategan commented 8 months ago

Describe the problem

When running the test with a shared process virtual cluster:

--- FAIL: TestStatusGetFiles (5.15s)
=== RUN   TestStatusGetFiles/heap
    files_test.go:70: rpc error: code = Unimplemented desc = dump directory not configured: HEAP
    --- FAIL: TestStatusGetFiles/heap (0.00s)

=== RUN   TestStatusGetFiles/goroutines
    files_test.go:115: rpc error: code = Unimplemented desc = dump directory not configured: GOROUTINES
    --- FAIL: TestStatusGetFiles/goroutines (0.00s)

=== RUN   TestStatusGetFiles/path_separators
    files_test.go:144: GetFiles: path separators allowed in pattern
    --- FAIL: TestStatusGetFiles/path_separators (0.00s)

How to reproduce

Set the following in TestServerArgs to force a shared process tenant and cause the failure to happen.

DefaultTestTenant: base.SharedTestTenantAlwaysEnabled,

Expected behavior The test should pass when running with a shared process virtual cluster, unless after investigation the test is expected not to work with one.

Ref: https://github.com/cockroachdb/cockroach/issues/112857 Epic CRDB-38970

Jira issue: CRDB-32693

xinhaoz commented 7 months ago

Looks like the issue is with these lines in makeSharedProcessTenantServerConfig. The status server GetFiles knows which directory the goroutine dumps are in from the sql server config's goroutinedumpdir.

That's set in makeSharedProcessTenantServerConfig, but it looks like the non-empty value is overwritten. First it's set to the kv server config's values here, and then ~10 lines later, it's set to empty string here.

As for which one is correct, it looks like the first link was added more recently in this commit specifically to fix this scenario, but perhaps it missed deleting the second set of lines. I'm hesitant to say it was missed off the bat though, and I'm sure there may be reasons as to why we need to set those dirs to the empty string, so we should see if anything else breaks when deleting those lines.