The latest nightly shows a timeout in fast/ConfigIncrementChangeCoordinators.toml. It reproduces with the following:
Commit: df2c1374cb923e8da5aa9949839ef62ee0d36b91
Seed: 2893897391
Buggify: on
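
For convenience, the parameters above can be turned into a simulation invocation. This is a sketch rather than the exact command the nightly used: the binary location (`bin/fdbserver`) and the `tests/` prefix on the test file are assumptions about the build tree, and the source tree is assumed to be checked out at the commit listed above.

```shell
#!/usr/bin/env bash
# Assumed locations; adjust to your own build tree.
FDBSERVER="${FDBSERVER:-bin/fdbserver}"
TEST_FILE="tests/fast/ConfigIncrementChangeCoordinators.toml"
SEED=2893897391

# -r simulation: run the test in the deterministic simulator
# -f: test file, -s: random seed, -b on: enable Buggify
CMD="$FDBSERVER -r simulation -f $TEST_FILE -s $SEED -b on"
echo "$CMD"

# Only launch the run when the binary is actually present.
if [ -x "$FDBSERVER" ]; then
    $CMD
fi
```

Set `FDBSERVER` to point at a build of the commit above if the binary lives elsewhere.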
The test run appears to get stuck while moving the cstate at https://github.com/apple/foundationdb/blob/772a9ab9fc1800d7dfaacb38dcf94ec41a9b7c3b/fdbserver/CoordinatedState.actor.cpp#L350-L351. All old coordinators have their configuration nodes locked successfully, but a majority of `ForwardRequest` replies are never received. A recovery takes place at exactly this moment, leaving the database in an unhealthy state: the configuration nodes are all locked, which prevents any further configuration database transactions and causes the test to time out.