Open jasonyb opened 2 years ago
Has this problem been resolved?
I see the code has been touched since then by cb74430b79d9493fc3d23b1496cd4d7f0c4c41f1 but the issue remains. In case it wasn't clear, this is a test-only issue.
Jira Link: DB-3418
Description
I found that the external mini cluster does not properly set replication_factor. There's old code
but that gflag only affects the test process, not the daemon processes. The daemons get the default RF of 3 even when there is only 1 master. This can cause any newly written test that uses one master and one tserver to fail with
when trying to do a postgres query (maybe generally any transactional query). This is because catalog manager relies on RF to determine how many tservers to wait for before creating the transaction status tablet:
There are some existing tests that have 1 master but have been running with the unexpected RF of 3 the whole time. For example, PgWrapperTestBase::GetNumMasters returns 1, and this is the base test for most PG C++ tests. Fixing the code to set the replication_factor flag on the master based on num_masters will change the replication factor for a lot of tests. If the change has a bad side effect on a particular test, that test can set its replication factor explicitly.
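A minimal sketch of the proposed behavior, under stated assumptions: `MasterFlagsWithReplicationFactor` is a hypothetical helper (not an existing ExternalMiniCluster function) that takes the extra flags a test passes to its master daemons and appends `--replication_factor=<num_masters>` unless the test already set the flag explicitly. The real fix would live wherever the external mini cluster builds the master command line.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; the name and signature are
// assumptions, not part of the actual ExternalMiniCluster API.
// Returns the flags to pass to each master daemon, deriving
// --replication_factor from num_masters unless the test overrides it.
std::vector<std::string> MasterFlagsWithReplicationFactor(
    std::vector<std::string> extra_master_flags, int num_masters) {
  assert(num_masters > 0);
  const std::string kPrefix = "--replication_factor=";
  for (const std::string& flag : extra_master_flags) {
    // An explicit setting from the test wins; leave the flags untouched.
    if (flag.compare(0, kPrefix.size(), kPrefix) == 0) {
      return extra_master_flags;
    }
  }
  // No explicit setting: match the replication factor to the master count.
  extra_master_flags.push_back(kPrefix + std::to_string(num_masters));
  return extra_master_flags;
}
```

With this shape, a test with 1 master gets RF 1 by default, while a test that depends on the old behavior can keep it by passing `--replication_factor=3` in its extra master flags.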