Open ShlomiBalalis opened 2 years ago
Also, since the repair takes much less time to run as of 3.0, I ran the same scenario with a larger data size so that the repair would again take about 2 minutes (roughly the same as in the 2.6 run), but the repair succeeds all the same:
< t:2022-05-10 13:54:40,527 f:base.py l:142 c:RemoteLibSSH2CmdRunner p:DEBUG > Command "sudo sctool repair -c 0e90fbc1-a3b1-4482-97eb-1e76b6f967f0 --fail-fast" finished with status 0
repair/a5725013-7b3d-43ce-8dd9-d3776d8b0999
After the encryption was activated:
< t:2022-05-10 14:01:50,853 f:cli.py l:1056 c:sdcm.mgmt.cli p:DEBUG > Issuing: 'sctool -c 0e90fbc1-a3b1-4482-97eb-1e76b6f967f0 progress repair/a5725013-7b3d-43ce-8dd9-d3776d8b0999'
Run: bc8ea0de-d068-11ec-b262-0a4fbbd9ca01
Status: DONE
Start time: 10 May 22 13:54:40 UTC
End time: 10 May 22 13:56:41 UTC
Duration: 2m1s
Progress: 100%
Datacenters:
- eu-west
╭───────────────────────────────┬────────────────────────────────┬──────────┬──────────╮
│ Keyspace │ Table │ Progress │ Duration │
├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
│ keyspace1 │ standard1 │ 100% │ 19s │
├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
│ simplestrategy_keyspace │ example_table │ 100% │ 0s │
├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
│ system_auth │ role_attributes │ 100% │ 0s │
│ system_auth │ role_members │ 100% │ 0s │
│ system_auth │ roles │ 100% │ 0s │
├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
│ system_distributed_everywhere │ cdc_generation_descriptions_v2 │ 100% │ 1s │
├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
│ system_distributed │ cdc_generation_timestamps │ 100% │ 1s │
│ system_distributed │ cdc_streams_descriptions_v2 │ 100% │ 0s │
│ system_distributed │ service_levels │ 100% │ 0s │
│ system_distributed │ view_build_status │ 100% │ 0s │
├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
│ system_traces │ events │ 100% │ 0s │
│ system_traces │ node_slow_log │ 100% │ 0s │
│ system_traces │ node_slow_log_time_idx │ 100% │ 0s │
│ system_traces │ sessions │ 100% │ 0s │
│ system_traces │ sessions_time_idx │ 100% │ 0s │
╰───────────────────────────────┴────────────────────────────────┴──────────┴──────────╯
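To compare runs like the one above without reading them by eye, the header of the `sctool progress` output can be parsed into fields. The following is only a minimal sketch: the helper names `parse_progress_header` and `duration_seconds` are hypothetical, and it assumes the header keeps the `Key: value` layout shown in this issue and that durations use only `h`, `m` and `s` units.

```python
import re


def parse_progress_header(output: str) -> dict:
    """Collect the 'Key: value' header lines that precede the per-table grid."""
    fields = {}
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("╭"):
            # The per-table grid starts here; the header is over.
            break
        match = re.match(r"^([A-Za-z ]+):\s*(.*)$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def duration_seconds(text: str) -> int:
    """Convert a duration such as '2m1s' or '19s' into seconds (h/m/s units only)."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * units[u] for n, u in re.findall(r"(\d+)([hms])", text))


# Header lines copied from the progress output in this issue.
sample = """Run: bc8ea0de-d068-11ec-b262-0a4fbbd9ca01
Status: DONE
Start time: 10 May 22 13:54:40 UTC
End time: 10 May 22 13:56:41 UTC
Duration: 2m1s
Progress: 100%
Datacenters:
- eu-west
"""

header = parse_progress_header(sample)
print(header["Status"], duration_seconds(header["Duration"]))
```

With the sample above, the reported duration converts to 121 seconds, which matches the gap between the start and end times.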
The last test function in the sanity test goes as follows:
Here is a proper working run in the ipv6 sanity of 2.6:
sctool repair -c 39bc1f68-6fa6-4b54-93ad-802d6ef58061 --fail-fast
repair/7b236f58-abb8-47a0-8032-73257ca8f8be
After we enable client encryption throughout the cluster (logs of the run):
In 3.0, oddly enough (where the test uses the same data size), the repair seemingly ends before the encryption is activated:
So, I created a manual scenario where we create the task before the encryption but only start it after the encryption is on:
< t:2022-05-09 17:27:03,592 f:cli.py l:1056 c:sdcm.mgmt.cli p:DEBUG > Issuing: 'sctool repair -c c4f0ef1c-8a0e-4e3b-beec-5c91738c18af --fail-fast --cron '35 17 * * *' '
repair/00cbc57d-d92a-476b-90af-96ff8a9d7a9c
At this point we activate the encryption across the cluster, and then start the repair (from a node's log):
< t:2022-05-09 17:34:08,795 f:cli.py l:1056 c:sdcm.mgmt.cli p:DEBUG > Issuing: 'sctool start repair/00cbc57d-d92a-476b-90af-96ff8a9d7a9c -c c4f0ef1c-8a0e-4e3b-beec-5c91738c18af'
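The manual scenario can be summarized as a sequence of three `sctool` invocations. This is a rough sketch that only assembles the command lines seen in the logs of this issue; the helper `build_repro_commands` is hypothetical, and enabling client encryption between the first and second command is left out because the issue does not show how it is done.

```python
import shlex
from typing import List


def build_repro_commands(cluster_id: str, task_id: str, cron: str) -> List[str]:
    """Return the command lines for the manual scenario, in order.

    task_id is the 'repair/<uuid>' value printed by the first command,
    so in a real run it is only known after that command has returned.
    """
    # 1. Create the repair task before encryption is enabled, scheduled via cron.
    schedule = shlex.join(
        ["sctool", "repair", "-c", cluster_id, "--fail-fast", "--cron", cron]
    )
    # 2. (Client encryption is enabled across the cluster here; not shown.)
    # 3. Start the already-created task explicitly.
    start = shlex.join(["sctool", "start", task_id, "-c", cluster_id])
    # 4. Check how the run ended.
    progress = shlex.join(["sctool", "-c", cluster_id, "progress", task_id])
    return [schedule, start, progress]


commands = build_repro_commands(
    cluster_id="c4f0ef1c-8a0e-4e3b-beec-5c91738c18af",
    task_id="repair/00cbc57d-d92a-476b-90af-96ff8a9d7a9c",
    cron="35 17 * * *",
)
for command in commands:
    print(command)
```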
How can the repair run when the client encryption is on?
Logs:
It's important to note that the repair only succeeds in the ipv6 sanity. In any other sanity the repair fails all the same. An example from the ipv4 centos sanity of 3.0: