microsoft / CCF

Confidential Consortium Framework
https://microsoft.github.io/CCF/
Apache License 2.0
766 stars 207 forks source link

Raft: Nodes stay active later during reconfiguration to ensure liveness #5973

Closed achamayou closed 3 months ago

achamayou commented 4 months ago

Update the retirement logic, making use of #5962 and #6018, to fix a liveness issue in some reconfiguration situations. See the rcfg_01_12 and rcfg_01_timeout_12 scenarios in this PR for reproduction samples.

The gist of this change is that nodes remain active for longer in retirement, despite no longer being counted or counting themselves in election or persistence quorums. Leaders only step down once they have confirmed the new configuration can make progress without nodes in the old configuration to minimise the number of elections.

A node that is not retired (Active, N/A) can:

  1. Append entries if leader
  2. Run for election
  3. Ack entries
  4. Vote in elections

A node that has seen its retirement signed (Retired, Ordered) can:

  1. Append entries if leader *[Changed, previously not allowed [](https://github.com/microsoft/CCF/blob/443309e10dff61385f758ecc5d7f932570704d27/src/consensus/aft/raft.h#L900)]**
  2. Run for election *[Changed, previously not allowed [](https://github.com/microsoft/CCF/blob/443309e10dff61385f758ecc5d7f932570704d27/src/consensus/aft/raft.h#L1959)]**
  3. Ack entries [Changed, previously not allowed past retirement_committable_idx]
  4. Vote in elections

A node that has seen its retirement committed (Retired, Completed) can:

  1. Append entries if leader *[Changed, previously not allowed [](https://github.com/microsoft/CCF/blob/443309e10dff61385f758ecc5d7f932570704d27/src/consensus/aft/raft.h#L900)]**
  2. Run for election *[Changed, previously not allowed [](https://github.com/microsoft/CCF/blob/443309e10dff61385f758ecc5d7f932570704d27/src/consensus/aft/raft.h#L1959)]**
  3. Ack entries [Changed, previously not allowed past retirement_committable_idx]
  4. Vote in elections

Such a node does not count itself for persistence and election quorums however, and no longer keeps track of the configuration it was once part of.

A node that has seen its retirement committed committed (Retired, RetiredCommitted) can:

  1. Ack entries [Changed, previously not allowed past retirement_committable_idx]
  2. Vote in elections

In addition to these changes, nodes now continue to keep track of other nodes whose retirement was committed in all_other_nodes, until their retirement committed is committed, and to send them AEs.

This change does not contain any renaming, which I think is best done separately, but is necessary. The retirement phase enum is really confusingly named at this point and can surely be simplified.

I have made corresponding changes in the spec, so that trace validation and simulation work, but the model checking has become much slower as a result. I am trying to address that, input welcome.

ccf-bot commented 4 months ago

update_retirement_behaviour@82669 aka 20240315.6 vs main ewma over 20 builds from 82315 to 82660

Click to see table main | build_id | build_number | tlc_sim_traces | tlc_sim_levelmean | Commit latency factor | tpcc_sgx_cft^ | tpcc_sgx_cft_mem | ls_sgx_cft^ | ls_sgx_cft_mem | pi_ls_sgx_cft^ | pi_ls_sgx_cft_mem | pi_basic_sgx_cft^ | pi_basic_sgx_cft_mem | pi_basic_js_sgx_cft^ | pi_basic_js_sgx_cft_mem | ls_jwt_sgx_cft^ | ls_jwt_sgx_cft_mem | pi_ls_jwt_sgx_cft^ | pi_ls_jwt_sgx_cft_mem | ls_js_sgx_cft^ | ls_js_sgx_cft_mem | pi_basic_mt_virtual_cft^ | ls_full_js_sgx_cft^ | ls_full_js_sgx_cft_mem | ls_js_jwt_sgx_cft^ | ls_js_jwt_sgx_cft_mem | hist_sgx_cft^ | RB put (/s)^ | CHAMP put (/s)^ | RB get (/s)^ | CHAMP get (/s)^ | pi_basic_mt_sgx_cft^ | pi_basic_mt_sgx_cft_mem | tpcc_virtual_cft^ | ls_virtual_cft^ | pi_ls_virtual_cft^ | pi_basic_virtual_cft^ | pi_basic_js_virtual_cft^ | ls_jwt_virtual_cft^ | pi_ls_jwt_virtual_cft^ | ls_js_virtual_cft^ | ls_full_js_virtual_cft^ | ls_js_jwt_virtual_cft^ | KV ser (/s)^ | KV deser (/s)^ | KV snap ser (/s)^ | KV snap deser (/s)^ | |-----------:|:---------------|-----------------:|--------------------:|------------------------:|----------------:|-------------------:|--------------:|-----------------:|-----------------:|--------------------:|--------------------:|-----------------------:|-----------------------:|--------------------------:|------------------:|---------------------:|---------------------:|------------------------:|-----------------:|--------------------:|---------------------------:|----------------------:|-------------------------:|---------------------:|------------------------:|----------------:|---------------:|------------------:|---------------:|------------------:|-----------------------:|--------------------------:|--------------------:|------------------:|---------------------:|------------------------:|---------------------------:|----------------------:|-------------------------:|---------------------:|--------------------------:|-------------------------:|-----------------:|-----------------:|--------------------:|----------------------:| | 82315 | 20240306.3 | 34133 | 375 | 0.804482 | 5618.78 | 8.59996e+07 | 14032.1 | 1.88908e+07 | 14218.5 | 1.05021e+07 | 15699.3 | 1.46964e+07 | 1377.2 | 1.25993e+07 | 6836.79 | 1.88908e+07 | 6931.8 | 6.30784e+06 | 5795.63 | 1.67936e+07 | 82959.5 | 5481.18 | 1.67936e+07 | 3995.34 | 1.67936e+07 | 41794.9 | 835789 | 1.18617e+06 | 8.16473e+06 | 3.06518e+07 | 28112.8 | 2.30851e+07 | 17468.7 | 53231.1 | 57121.3 | 61237 | 4417.9 | 20812.6 | 21930.3 | 18069.5 | 17417 | 11588.6 | nan | nan | nan | nan | | 82321 | 20240306.6 | 33702 | 377 | 0.785048 | 5629.75 | 8.59996e+07 | 14041.7 | 1.88908e+07 | 14151.6 | 1.05021e+07 | 15662.6 | 1.46964e+07 | 1373.3 | 1.25993e+07 | 6865.97 | 1.88908e+07 | 6964.6 | 6.30784e+06 | 5757.88 | 1.67936e+07 | 82724.1 | 5474.9 | 1.67936e+07 | 4002.67 | 1.67936e+07 | 44891.4 | 836242 | 1.17772e+06 | 8.15264e+06 | 3.07655e+07 | 28012.5 | 2.51822e+07 | 17426.5 | 52879.8 | 55433.3 | 60634.7 | 4389.5 | 20624.1 | 21800.7 | 17655.9 | 17270.1 | 11506 | nan | nan | nan | nan | | 82332 | 20240306.10 | 30994 | 379 | 0.79874 | 5620.65 | 8.59996e+07 | 13985.2 | 1.88908e+07 | 14110.8 | 1.05021e+07 | 15556.5 | 1.25993e+07 | 1372.4 | 1.25993e+07 | 6857.74 | 1.88908e+07 | 6946.2 | 6.30784e+06 | 5778.16 | 1.67936e+07 | 88945 | 5452.62 | 1.67936e+07 | 4002.45 | 1.67936e+07 | 43341.5 | 833759 | 1.18045e+06 | 8.15141e+06 | 3.07019e+07 | 28111.8 | 2.30851e+07 | 17295.2 | 53045.1 | 56284.1 | 61773.4 | 4417.8 | 20958.4 | 21114.9 | 17628.6 | 17485.5 | 11579.5 | nan | nan | nan | nan | | 82346 | 20240306.15 | 31794 | 382 | 0.822748 | 5532.28 | 8.59996e+07 | 13984.7 | 1.67936e+07 | 14123.9 | 1.05021e+07 | 15507.5 | 1.46964e+07 | 1357.1 | 1.25993e+07 | 6797.8 | 1.67936e+07 | 6880.8 | 6.30784e+06 | 5810.15 | 1.67936e+07 | 84614.2 | 5482.26 | 1.67936e+07 | 3999.43 | 1.67936e+07 | 46510.5 | 842469 | 1.1765e+06 | 8.15488e+06 | 3.16328e+07 | 28204.7 | 2.30851e+07 | 17297 | 53156.3 | 56558.2 | 62042.4 | 4437.1 | 21050.7 | 21811.7 | 17575.8 | 17148.5 | 11647.7 | nan | nan | nan | nan | | 82358 | 20240306.20 | 29926 | 373 | 0.824282 | 5602.55 | 8.59996e+07 | 13998.3 | 1.88908e+07 | 14156 | 1.05021e+07 | 15615.3 | 1.25993e+07 | 1368.6 | 1.25993e+07 | 6836.5 | 1.88908e+07 | 6958.5 | 6.30784e+06 | 5781.81 | 1.67936e+07 | 68591.9 | 5438.22 | 1.67936e+07 | 3987.94 | 1.67936e+07 | 45499.4 | 818328 | 1.16486e+06 | 8.15589e+06 | 3.29276e+07 | 28080.3 | 2.51822e+07 | 17287.7 | 52787.3 | 56772.1 | 62175 | 4443.6 | 20635.9 | 21556.4 | 17896.2 | 17100.2 | 11641.7 | 1.02754e+06 | 1.12689e+06 | 7796.25 | 1400.78 | | 82372 | 20240307.1 | 30068 | 384 | 0.798002 | 5590.26 | 8.59996e+07 | 14010.1 | 1.88908e+07 | 14116.1 | 1.05021e+07 | 15590 | 1.25993e+07 | 1375 | 1.25993e+07 | 6865.1 | 1.88908e+07 | 6975.9 | 6.30784e+06 | 5800.08 | 1.67936e+07 | 79631.9 | 5440.03 | 1.67936e+07 | 3991.31 | 1.67936e+07 | 46623.6 | 835657 | 1.18166e+06 | 8.16509e+06 | 3.06216e+07 | 27854.3 | 2.30851e+07 | 17243.6 | 53319.1 | 56475.5 | 61647.9 | 4379.6 | 20880.9 | 21834.1 | 17540.1 | 17040.5 | 11606.2 | 1.04026e+06 | 1.11682e+06 | 8057.08 | 1405.32 | | 82392 | 20240307.9 | 28161 | 378 | 0.787063 | 5555.63 | 8.59996e+07 | 13954.8 | 1.88908e+07 | 14096.3 | 1.05021e+07 | 15535.1 | 1.46964e+07 | 1352 | 1.25993e+07 | 6866.11 | 1.88908e+07 | 7040.7 | 6.30784e+06 | 5750.37 | 1.67936e+07 | 90981.8 | 5489.26 | 1.67936e+07 | 3986.25 | 1.67936e+07 | 45458.9 | 834772 | 1.18389e+06 | 8.15329e+06 | 3.08838e+07 | 28322 | 2.30851e+07 | 17187.9 | 52909.2 | 57406.9 | 62445.4 | 4411.1 | 20940.2 | 22262.8 | 17671.6 | 17636.1 | 11746.4 | 1.05119e+06 | 1.12956e+06 | 7843.39 | 1403.5 | | 82407 | 20240308.2 | 29743 | 372 | 0.830629 | 5654.78 | 8.59996e+07 | 14012.2 | 1.88908e+07 | 14138.1 | 1.05021e+07 | 15672.6 | 1.46964e+07 | 1364.9 | 1.05021e+07 | 6853.4 | 1.88908e+07 | 6925 | 6.30784e+06 | 5767.26 | 1.67936e+07 | 90235.1 | 5480.12 | 1.67936e+07 | 3994.03 | 1.67936e+07 | 45953.3 | 836421 | 1.17254e+06 | 8.13163e+06 | 3.07558e+07 | 27859.2 | 2.72794e+07 | 17359.4 | 53373.2 | 56324.1 | 61935.7 | 4429.2 | 20644.8 | 21588.5 | 17541.9 | 17446.5 | 11719.9 | 1.02103e+06 | 1.17069e+06 | 8124.79 | 1402.61 | | 82441 | 20240311.1 | 38449 | 376 | 0.818209 | 5572.44 | 8.59996e+07 | 13983.1 | 1.88908e+07 | 13370.1 | 1.05021e+07 | 15453.9 | 1.25993e+07 | 1358.8 | 1.25993e+07 | 6870.14 | 1.88908e+07 | 7077.5 | 6.30784e+06 | 5790.64 | 1.67936e+07 | 96888.6 | 5456.74 | 1.67936e+07 | 3985.54 | 1.67936e+07 | 45635 | 828598 | 1.18512e+06 | 8.153e+06 | 3.02507e+07 | 28099.2 | 2.30851e+07 | 17288.1 | 53367.8 | 56506.8 | 62100.9 | 4417.4 | 20803.9 | 21587.1 | 17464.1 | 17137.7 | 11747.7 | 1.03584e+06 | 1.16374e+06 | 7514.24 | 1398.31 | | 82456 | 20240311.7 | 39343 | 379 | 0.839718 | 5496.55 | 8.59996e+07 | 13992.5 | 1.67936e+07 | 14138.2 | 1.05021e+07 | 15452.5 | 1.25993e+07 | 1359.7 | 1.25993e+07 | 7197.94 | 1.67936e+07 | 6906.8 | 6.30784e+06 | 5798.3 | 1.67936e+07 | 75238.8 | 5445.48 | 1.67936e+07 | 3987.39 | 1.67936e+07 | 40420.9 | 835081 | 1.17571e+06 | 8.12998e+06 | 3.11379e+07 | 28204.3 | 2.30851e+07 | 17270.2 | 52929.5 | 40908.3 | 62552.9 | 4445.4 | 20602.8 | 22072.1 | 17369.4 | 17229.3 | 11525.8 | 1.0003e+06 | 1.1489e+06 | 7738.78 | 1400.32 | | 82473 | 20240312.1 | 39294 | 382 | 0.805266 | 5587.27 | 8.59996e+07 | 13996.9 | 1.88908e+07 | 14196.5 | 1.05021e+07 | 15567.6 | 1.46964e+07 | 1368.9 | 1.25993e+07 | 6854.5 | 1.67936e+07 | 7000.1 | 6.30784e+06 | 5790.57 | 1.67936e+07 | 94835.1 | 5461.32 | 1.67936e+07 | 3996.37 | 1.67936e+07 | 45755.4 | 834082 | 1.17499e+06 | 8.15056e+06 | 3.06353e+07 | 27888.1 | 2.51822e+07 | 17246.9 | 52699.4 | 56658.8 | 62660.2 | 4303.5 | 21030.7 | 21444.5 | 17406.2 | 17509.8 | 11603.4 | 1.0445e+06 | 1.14247e+06 | 7854 | 1405.81 | | 82502 | 20240312.11 | 40629 | 382 | 0.829827 | 5635.38 | 8.59996e+07 | 14074.4 | 1.67936e+07 | 14154 | 1.05021e+07 | 15695 | 1.25993e+07 | 1377.7 | 1.25993e+07 | 7212.08 | 1.67936e+07 | 7045.2 | 6.30784e+06 | 5796.65 | 1.67936e+07 | 67917.8 | 5481.05 | 1.67936e+07 | 3993.69 | 1.67936e+07 | 42603.8 | 834662 | 1.17789e+06 | 8.15008e+06 | 3.1622e+07 | 28411.5 | 2.30851e+07 | 17181.6 | 52840.8 | 56306.6 | 60613.1 | 4403.8 | 20810.3 | 21757.1 | 17364.2 | 17169.1 | 11736.7 | 1.02249e+06 | 1.11421e+06 | 8295.97 | 1399.08 | | 82521 | 20240313.2 | 40274 | 372 | 0.802259 | 5630.28 | 8.59996e+07 | 14016.3 | 1.67936e+07 | 14157.1 | 1.05021e+07 | 15625.3 | 1.46964e+07 | 1370.8 | 1.25993e+07 | 7221.85 | 1.67936e+07 | 6967.1 | 6.30784e+06 | 5805.73 | 1.67936e+07 | 76117 | 5440.36 | 1.67936e+07 | 4000.28 | 1.67936e+07 | 43377.2 | 830900 | 1.1784e+06 | 8.13131e+06 | 3.11691e+07 | 28015.2 | 2.30851e+07 | 17221.6 | 52968.6 | 55930.6 | 61014 | 4425.8 | 21086.2 | 21616.5 | 17381.5 | 17146.1 | 11718.2 | 1.0381e+06 | 1.07747e+06 | 7780.75 | 1400.44 | | 82561 | 20240313.15 | 38781 | 371 | 0.826024 | 5556.37 | 8.59996e+07 | 13920.3 | 1.88908e+07 | 14042.1 | 1.05021e+07 | 15390.3 | 1.46964e+07 | 1354.4 | 1.25993e+07 | 6798.93 | 1.88908e+07 | 6821.4 | 6.30784e+06 | 5783.72 | 1.67936e+07 | 87765.6 | 5479.15 | 1.67936e+07 | 3970.22 | 1.67936e+07 | 43173.6 | 831057 | 1.17861e+06 | 8.15024e+06 | 3.06293e+07 | 28184.8 | 2.51822e+07 | 17277.8 | 53059.5 | 56330.7 | 62396.4 | 4435.2 | 20981.1 | 21688.2 | 20773 | 17387.5 | 11678.1 | 1.0686e+06 | 1.16496e+06 | 7967.99 | 1398.97 | | 82592 | 20240314.2 | 38555 | 373 | 0.803179 | 5591.23 | 8.59996e+07 | 13986.3 | 1.88908e+07 | 14075.3 | 1.05021e+07 | 15603.2 | 1.25993e+07 | 1368 | 1.25993e+07 | 6825.51 | 1.88908e+07 | 6927.3 | 6.30784e+06 | 5791.99 | 1.67936e+07 | 89283.3 | 5445.85 | 1.67936e+07 | 3996.47 | 1.67936e+07 | 46713 | 817299 | 1.18123e+06 | 8.12576e+06 | 3.11341e+07 | 28159 | 2.30851e+07 | 17520.4 | 53231.3 | 56074.6 | 60237.6 | 4376.3 | 20914.1 | 22366.9 | 17567.3 | 17621.7 | 11732.7 | 1.05496e+06 | 1.15902e+06 | 7953.59 | 1402.1 | | 82609 | 20240314.9 | 38112 | 381 | 0.842778 | 5524.91 | 8.59996e+07 | 14014.6 | 1.67936e+07 | 14135.5 | 1.05021e+07 | 15615.2 | 1.46964e+07 | 1366.1 | 1.25993e+07 | 6842.49 | 1.67936e+07 | 7080.4 | 6.30784e+06 | 5793.47 | 1.67936e+07 | 85734 | 5469.72 | 1.67936e+07 | 3976.63 | 1.67936e+07 | 47474.2 | 833654 | 1.18237e+06 | 8.16864e+06 | 3.02101e+07 | 28014.3 | 2.30851e+07 | 17523.4 | 53248.3 | 56867.9 | 63565.8 | 4425.6 | 20628.4 | 22031.3 | 17455.4 | 17418.7 | 11703 | 1.00746e+06 | 1.08319e+06 | 8424.96 | 1404.01 | | 82620 | 20240314.12 | 39491 | 379 | 0.821359 | 5600 | 8.59996e+07 | 14034.6 | 1.88908e+07 | 14133 | 1.05021e+07 | 15635.1 | 1.25993e+07 | 1369.7 | 1.25993e+07 | 6906.12 | 1.88908e+07 | 7082.2 | 6.30784e+06 | 5794.44 | 1.67936e+07 | 79508.6 | 5462.66 | 1.67936e+07 | 3985.85 | 1.67936e+07 | 47389.9 | 835100 | 1.18237e+06 | 8.13644e+06 | 3.07632e+07 | 28220.4 | 2.51822e+07 | 17148.3 | 53103.1 | 56239.8 | 62026.5 | 4421 | 21105.2 | 22200.5 | 17605.9 | 17370.4 | 11726.6 | 1.0269e+06 | 1.19674e+06 | 8035.91 | 1400 | | 82628 | 20240314.15 | 38033 | 376 | 0.792583 | 5584.46 | 8.59996e+07 | 14030.7 | 1.88908e+07 | 14081 | 1.05021e+07 | 15558.6 | 1.25993e+07 | 1373.6 | 1.25993e+07 | 7217 | 1.67936e+07 | 6956.3 | 6.30784e+06 | 5777.25 | 1.67936e+07 | 97914.9 | 5434.19 | 1.67936e+07 | 3994.66 | 1.67936e+07 | 43008.5 | 834807 | 1.17971e+06 | 8.14343e+06 | 3.08243e+07 | 28262.2 | 2.30851e+07 | 17217.6 | 53209.4 | 56533.3 | 60980.9 | 4357.8 | 21035.9 | 22036.6 | 17867.8 | 17344.8 | 11569.9 | 1.02051e+06 | 1.15354e+06 | 7868.46 | 1402.68 | | 82654 | 20240315.2 | 39409 | 373 | 0.814514 | 5618.03 | 8.59996e+07 | 13992.7 | 1.88908e+07 | 14158.4 | 1.05021e+07 | 15569.9 | 1.46964e+07 | 1370.4 | 1.25993e+07 | 6855.54 | 1.67936e+07 | 7078.3 | 6.30784e+06 | 5809.14 | 1.67936e+07 | 75376 | 5471.9 | 1.67936e+07 | 3964.25 | 1.67936e+07 | 44123.6 | 835932 | 1.18162e+06 | 8.12334e+06 | 3.19915e+07 | 28314.8 | 2.72794e+07 | 17498.5 | 53258.9 | 57152.3 | 63228.6 | 4435.5 | 20753.7 | 21386.7 | 17614.6 | 17361.3 | 11712.2 | 997407 | 1.10023e+06 | 7593.71 | 1401.07 | | 82660 | 20240315.4 | 37881 | 370 | 0.79239 | 5655.25 | 8.59996e+07 | 14026.5 | 1.88908e+07 | 14117.6 | 1.05021e+07 | 15641.5 | 1.46964e+07 | 1368.7 | 1.25993e+07 | 6842.27 | 1.88908e+07 | 6972.1 | 6.30784e+06 | 5792.18 | 1.67936e+07 | 79337.2 | 5483.28 | 1.67936e+07 | 3976.83 | 1.67936e+07 | 40414.1 | 800286 | 1.18393e+06 | 8.15319e+06 | 3.06816e+07 | 28028.6 | 2.30851e+07 | 17278.6 | 52740.4 | 57193.2 | 62733.4 | 4440.8 | 20988.5 | 22202.5 | 20747.1 | 17454.8 | 11817.2 | 1.04406e+06 | 1.17137e+06 | 8002.43 | 1400.4 | update_retirement_behaviour | build_id | build_number | pi_basic_mt_sgx_cft^ | pi_basic_mt_sgx_cft_mem | Commit latency factor | tpcc_sgx_cft^ | tpcc_sgx_cft_mem | ls_sgx_cft^ | ls_sgx_cft_mem | pi_ls_sgx_cft^ | pi_ls_sgx_cft_mem | pi_basic_sgx_cft^ | pi_basic_sgx_cft_mem | pi_basic_mt_virtual_cft^ | tpcc_virtual_cft^ | pi_basic_js_sgx_cft^ | pi_basic_js_sgx_cft_mem | ls_virtual_cft^ | pi_ls_virtual_cft^ | pi_basic_virtual_cft^ | ls_jwt_sgx_cft^ | ls_jwt_sgx_cft_mem | pi_ls_jwt_sgx_cft^ | pi_ls_jwt_sgx_cft_mem | pi_basic_js_virtual_cft^ | ls_jwt_virtual_cft^ | ls_js_sgx_cft^ | ls_js_sgx_cft_mem | pi_ls_jwt_virtual_cft^ | ls_js_virtual_cft^ | ls_full_js_sgx_cft^ | ls_full_js_sgx_cft_mem | ls_full_js_virtual_cft^ | ls_js_jwt_virtual_cft^ | ls_js_jwt_sgx_cft^ | ls_js_jwt_sgx_cft_mem | hist_sgx_cft^ | KV ser (/s)^ | KV deser (/s)^ | KV snap ser (/s)^ | KV snap deser (/s)^ | RB put (/s)^ | CHAMP put (/s)^ | RB get (/s)^ | CHAMP get (/s)^ | tlc_sim_traces | tlc_sim_levelmean | |-----------:|:---------------|-----------------------:|--------------------------:|------------------------:|----------------:|-------------------:|--------------:|-----------------:|-----------------:|--------------------:|--------------------:|-----------------------:|---------------------------:|--------------------:|-----------------------:|--------------------------:|------------------:|---------------------:|------------------------:|------------------:|---------------------:|---------------------:|------------------------:|---------------------------:|----------------------:|-----------------:|--------------------:|-------------------------:|---------------------:|----------------------:|-------------------------:|--------------------------:|-------------------------:|---------------------:|------------------------:|----------------:|-----------------:|-----------------:|--------------------:|----------------------:|---------------:|------------------:|---------------:|------------------:|-----------------:|--------------------:| | 82586 | 20240313.21 | 27791.8 | 2.30851e+07 | 0.815448 | 5624.27 | 8.59996e+07 | 13970.8 | 1.88908e+07 | 14088.8 | 1.05021e+07 | 15480.1 | 1.46964e+07 | 75344.7 | 17131.6 | 1372 | 1.25993e+07 | 50332.8 | 54782.5 | 58243.2 | 6847.99 | 1.88908e+07 | 6878.9 | 6.30784e+06 | 4339.4 | 17863.3 | 5791.31 | 1.67936e+07 | 20935.8 | 17531.9 | 5438.32 | 1.67936e+07 | 17379.6 | 11649.6 | 3999.09 | 1.67936e+07 | 44477.2 | 1.01854e+06 | 1.1383e+06 | 7949.88 | 1399.11 | 831309 | 1.17756e+06 | 8.15137e+06 | 3.07198e+07 | 32199 | 369 | | 82625 | 20240314.14 | 27895.2 | 2.51822e+07 | 0.848417 | 5652.45 | 8.59996e+07 | 13939.9 | 1.88908e+07 | 14050.9 | 1.05021e+07 | 15459.3 | 1.46964e+07 | 68696.2 | 17204.4 | 1358.4 | 1.25993e+07 | 52822.1 | 55922.8 | 62061.9 | 6879.96 | 1.88908e+07 | 6999.4 | 6.30784e+06 | 4409.8 | 20807.7 | 5794.3 | 1.67936e+07 | 21969.8 | 20764.3 | 5473.13 | 1.67936e+07 | 17327.6 | 11771 | 3964.31 | 1.67936e+07 | 46544.4 | 1.06781e+06 | 1.18008e+06 | 8236.36 | 1404.36 | 833907 | 1.18003e+06 | 8.15183e+06 | 3.22013e+07 | 32145 | 377 | | 82642 | 20240314.18 | 28020 | 2.30851e+07 | 0.817649 | 5563.54 | 8.59996e+07 | 13932.3 | 1.67936e+07 | 14060.8 | 1.05021e+07 | 15510.3 | 1.25993e+07 | 81900.2 | 16813.9 | 1360.7 | 1.25993e+07 | 52974.2 | 55598.8 | 61473.2 | 6773.06 | 1.67936e+07 | 7154 | 6.30784e+06 | 4434.4 | 20606.1 | 5746.21 | 1.67936e+07 | 21143.4 | 17421.9 | 5482.42 | 1.67936e+07 | 17204.1 | 11796.4 | 3988.9 | 1.67936e+07 | 45304.8 | 1.03799e+06 | 1.14956e+06 | 7809.91 | 1405.44 | 827270 | 1.17953e+06 | 8.14353e+06 | 3.05184e+07 | 29536 | 373 | | 82669 | 20240315.6 | 28075.2 | 2.30851e+07 | 0.794926 | 5634.14 | 8.59996e+07 | 13947.9 | 1.88908e+07 | 14012 | 1.05021e+07 | 15539.6 | 1.46964e+07 | 70022 | 17589 | 1368.6 | 1.25993e+07 | 53057.1 | 55818.8 | 62280.7 | 6845.94 | 1.67936e+07 | 6921.4 | 6.30784e+06 | 4405.1 | 20891.9 | 5759.33 | 1.67936e+07 | 22022.1 | 17388.2 | 5439.72 | 1.67936e+07 | 17282.6 | 11689.1 | 3997.67 | 1.67936e+07 | 44076.3 | 983961 | 1.16252e+06 | 8091.98 | 1405.17 | 825044 | 1.18e+06 | 8.1315e+06 | 3.06706e+07 | 31081 | 385 |

images

achamayou commented 4 months ago

Notes for tomorrow: the TV for swap_node_no_election stumbles on this:

{"h_ts": "249", "thread_id": "100", "level": "debug", "tag": "raft_trace", "file": "/home/amchamay/CCF/src/consensus/aft/raft.h", "number": "2228", "msg": {"configurations": [{"idx": 3, "nodes": {"0": {"address": ":"}, "1": {"address": ":"}}, "rid": 3}, {"idx": 5, "nodes": {"1": {"address": ":"}, "2": {"address": ":"}}, "rid": 5}], "function": "commit", "state": {"commit_idx": 6, "committable_indices": [], "current_view": 2, "last_idx": 6, "leadership_state": "None", "membership_state": "Retired", "new_view_idx": 0, "node_id": "0"}}, "cmd": null}

That's a commit in leadership state "None", because by the time we log this we've already stepped down. That doesn't play well with AdvanceCommitIndex of course.

I think it may be necessary to update the trace logic here, but I'll think about other possible solutions.

achamayou commented 4 months ago

Notes for tomorrow: the TV for swap_node_no_election stumbles on this:

{"h_ts": "249", "thread_id": "100", "level": "debug", "tag": "raft_trace", "file": "/home/amchamay/CCF/src/consensus/aft/raft.h", "number": "2228", "msg": {"configurations": [{"idx": 3, "nodes": {"0": {"address": ":"}, "1": {"address": ":"}}, "rid": 3}, {"idx": 5, "nodes": {"1": {"address": ":"}, "2": {"address": ":"}}, "rid": 5}], "function": "commit", "state": {"commit_idx": 6, "committable_indices": [], "current_view": 2, "last_idx": 6, "leadership_state": "None", "membership_state": "Retired", "new_view_idx": 0, "node_id": "0"}}, "cmd": null}

That's a commit in leadership state "None", because by the time we log this we've already stepped down. That doesn't play well with AdvanceCommitIndex of course.

I think it may be necessary to update the trace logic here, but I'll think about other possible solutions.

Fixed in #5984

lemmy commented 3 months ago
lemmy@ccf-model-checking:~/ccf/tla/consensus$ git log -n1
commit 6a7c9b839c8400305eb1e57504faf57d72bccc03 (HEAD -> update_retirement_behaviour, achamayou/update_retirement_behaviour)
Author: Amaury Chamayou <amchamay@microsoft.com>
Date:   Fri Mar 8 14:29:09 2024 +0000

    Rename reconfiguration scenarios
lemmy@ccf-model-checking:~/ccf/tla/consensus$ cat tlc
#!/bin/bash

java \
-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=55449 \
-Dcom.sun.management.jmxremote.ssl=false \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dtlc2.value.Values.width=140 \
-Dtlc2.tool.impl.Tool.cdot=true \
-Dtlc2.tool.queue.IStateQueue=DiskByteArrayQueue \
-Dtlc2.tool.fp.FPSet.impl=tlc2.tool.fp.OffHeapDiskFPSet \
-XX:StartFlightRecording=settings=default,disk=true,dumponexit=true,maxage=12h,filename=tlc.jfr \
-XX:+UnlockDiagnosticVMOptions \
-XX:+DebugNonSafepoints \
-XX:FlightRecorderOptions=stackdepth=256 \
-XX:+UseParallelGC \
-XX:MaxDirectMemorySize=128g \
-Xms32g \
-jar tla2tools.jar \
-workers auto \
-checkpoint 0 \
-nowarning \
-config MCccfraftWithReconfig.cfg \
MCccfraft.tla
lemmy@ccf-model-checking:~/ccf/tla/consensus$ ./tlc
[0.381s][info][jfr,startup] Started recording 1.
[0.381s][info][jfr,startup]
[0.381s][info][jfr,startup] Use jcmd 9379 JFR.dump name=1 to copy recording data to file.
TLC2 Version 2.18 of Day Month 20?? (rev: a422f06)
Running breadth-first search Model-Checking with fp 80 and seed -3707896216334801010 with 96 workers on 96 cores with 31403MB heap and 131072MB offheap memory [pid: 9379] (Linux 5.15.0-1058-azure amd64, Private Build 21.0.2 x86_64, OffHeapDiskFPSet, DiskByteArrayQueue).
Parsing file /home/lemmy/ccf/tla/consensus/MCccfraft.tla
Parsing file /home/lemmy/ccf/tla/consensus/ccfraft.tla
Parsing file /home/lemmy/ccf/tla/consensus/StatsFile.tla
Parsing file /home/lemmy/ccf/tla/consensus/MCAliases.tla
Parsing file /tmp/tlc-18063054040580557503/_TLCTrace.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/_TLCTrace.tla)
Parsing file /tmp/tlc-18063054040580557503/Naturals.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Naturals.tla)
Parsing file /tmp/tlc-18063054040580557503/FiniteSets.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/FiniteSets.tla)
Parsing file /tmp/tlc-18063054040580557503/Sequences.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Sequences.tla)
Parsing file /tmp/tlc-18063054040580557503/TLC.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/TLC.tla)
Parsing file /tmp/tlc-18063054040580557503/FiniteSetsExt.tla (jar:file:/home/lemmy/ccf/tla/consensus/CommunityModules-deps.jar!/FiniteSetsExt.tla)
Parsing file /tmp/tlc-18063054040580557503/SequencesExt.tla (jar:file:/home/lemmy/ccf/tla/consensus/CommunityModules-deps.jar!/SequencesExt.tla)
Parsing file /tmp/tlc-18063054040580557503/Functions.tla (jar:file:/home/lemmy/ccf/tla/consensus/CommunityModules-deps.jar!/Functions.tla)
Parsing file /tmp/tlc-18063054040580557503/Folds.tla (jar:file:/home/lemmy/ccf/tla/consensus/CommunityModules-deps.jar!/Folds.tla)
Parsing file /tmp/tlc-18063054040580557503/Json.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Json.tla)
Parsing file /home/lemmy/ccf/tla/consensus/Network.tla
Parsing file /tmp/tlc-18063054040580557503/Bags.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Bags.tla)
Parsing file /tmp/tlc-18063054040580557503/TLCExt.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/TLCExt.tla)
Parsing file /tmp/tlc-18063054040580557503/Integers.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Integers.tla)
Semantic processing of module Naturals
Semantic processing of module Sequences
Semantic processing of module FiniteSets
Semantic processing of module TLC
Semantic processing of module Folds
Semantic processing of module Functions
Semantic processing of module FiniteSetsExt
Semantic processing of module SequencesExt
Semantic processing of module Bags
Semantic processing of module Network
Semantic processing of module ccfraft
Semantic processing of module Json
Semantic processing of module StatsFile
Semantic processing of module Integers
Semantic processing of module TLCExt
Semantic processing of module MCAliases
Semantic processing of module _TLCTrace
Semantic processing of module MCccfraft
Starting... (2024-03-08 16:04:31)
Computing initial states...
Finished computing initial states: 1 distinct state generated at 2024-03-08 16:04:34.
Progress(33) at 2024-03-08 16:04:37: 96,403 states generated (96,403 s/min), 28,187 distinct states found (28,187 ds/min), 8,258 states left on queue.
Progress(43) at 2024-03-08 16:05:37: 5,729,316 states generated (5,632,913 s/min), 1,434,807 distinct states found (1,406,620 ds/min), 382,327 states left on queue.
Progress(46) at 2024-03-08 16:06:37: 11,424,234 states generated (5,694,918 s/min), 2,975,188 distinct states found (1,540,381 ds/min), 830,278 states left on queue.
Progress(47) at 2024-03-08 16:07:37: 17,863,908 states generated (6,439,674 s/min), 4,787,731 distinct states found (1,812,543 ds/min), 1,380,805 states left on queue.
Progress(48) at 2024-03-08 16:08:37: 24,413,625 states generated (6,549,717 s/min), 6,668,738 distinct states found (1,881,007 ds/min), 1,964,555 states left on queue.
Progress(49) at 2024-03-08 16:09:37: 30,794,133 states generated (6,380,508 s/min), 8,481,360 distinct states found (1,812,622 ds/min), 2,499,690 states left on queue.
Progress(49) at 2024-03-08 16:10:37: 37,083,516 states generated (6,289,383 s/min), 10,307,747 distinct states found (1,826,387 ds/min), 3,053,258 states left on queue.
Progress(50) at 2024-03-08 16:11:37: 43,168,157 states generated (6,084,641 s/min), 12,014,700 distinct states found (1,706,953 ds/min), 3,519,153 states left on queue.
Progress(50) at 2024-03-08 16:12:37: 49,359,837 states generated (6,191,680 s/min), 13,803,410 distinct states found (1,788,710 ds/min), 4,025,313 states left on queue.
Progress(50) at 2024-03-08 16:13:37: 55,634,319 states generated (6,274,482 s/min), 15,563,006 distinct states found (1,759,596 ds/min), 4,518,897 states left on queue.
Progress(51) at 2024-03-08 16:14:37: 61,629,168 states generated (5,994,849 s/min), 17,182,702 distinct states found (1,619,696 ds/min), 4,885,909 states left on queue.
Progress(51) at 2024-03-08 16:15:37: 67,923,038 states generated (6,293,870 s/min), 18,978,357 distinct states found (1,795,655 ds/min), 5,348,422 states left on queue.
Progress(51) at 2024-03-08 16:16:37: 74,071,386 states generated (6,148,348 s/min), 20,672,736 distinct states found (1,694,379 ds/min), 5,758,589 states left on queue.
Progress(52) at 2024-03-08 16:17:37: 80,337,318 states generated (6,265,932 s/min), 22,372,909 distinct states found (1,700,173 ds/min), 6,156,555 states left on queue.
Progress(52) at 2024-03-08 16:18:37: 86,263,623 states generated (5,926,305 s/min), 23,975,276 distinct states found (1,602,367 ds/min), 6,501,069 states left on queue.
Progress(52) at 2024-03-08 16:19:37: 92,310,298 states generated (6,046,675 s/min), 25,662,155 distinct states found (1,686,879 ds/min), 6,889,409 states left on queue.
Progress(52) at 2024-03-08 16:20:37: 98,434,313 states generated (6,124,015 s/min), 27,333,576 distinct states found (1,671,421 ds/min), 7,285,619 states left on queue.
Progress(52) at 2024-03-08 16:21:37: 104,792,771 states generated (6,358,458 s/min), 29,041,397 distinct states found (1,707,821 ds/min), 7,638,454 states left on queue.
Progress(53) at 2024-03-08 16:22:37: 110,319,753 states generated (5,526,982 s/min), 30,447,530 distinct states found (1,406,133 ds/min), 7,850,966 states left on queue.
Progress(53) at 2024-03-08 16:23:37: 116,000,671 states generated (5,680,918 s/min), 31,990,566 distinct states found (1,543,036 ds/min), 8,212,762 states left on queue.
Progress(53) at 2024-03-08 16:24:37: 122,429,673 states generated (6,429,002 s/min), 33,734,030 distinct states found (1,743,464 ds/min), 8,560,949 states left on queue.
Progress(53) at 2024-03-08 16:25:37: 128,050,021 states generated (5,620,348 s/min), 35,301,237 distinct states found (1,567,207 ds/min), 8,907,185 states left on queue.
Progress(53) at 2024-03-08 16:26:37: 134,302,149 states generated (6,252,128 s/min), 36,894,876 distinct states found (1,593,639 ds/min), 9,163,819 states left on queue.
Progress(53) at 2024-03-08 16:27:37: 140,539,064 states generated (6,236,915 s/min), 38,538,614 distinct states found (1,643,738 ds/min), 9,485,674 states left on queue.
Progress(54) at 2024-03-08 16:28:37: 146,897,815 states generated (6,358,751 s/min), 40,103,837 distinct states found (1,565,223 ds/min), 9,660,838 states left on queue.
Progress(54) at 2024-03-08 16:29:37: 153,143,072 states generated (6,245,257 s/min), 41,732,474 distinct states found (1,628,637 ds/min), 9,955,684 states left on queue.
Progress(54) at 2024-03-08 16:30:37: 159,523,608 states generated (6,380,536 s/min), 43,474,500 distinct states found (1,742,026 ds/min), 10,317,809 states left on queue.
Progress(54) at 2024-03-08 16:31:37: 165,859,941 states generated (6,336,333 s/min), 45,201,621 distinct states found (1,727,121 ds/min), 10,681,008 states left on queue.
Progress(54) at 2024-03-08 16:32:37: 172,102,960 states generated (6,243,019 s/min), 46,852,219 distinct states found (1,650,598 ds/min), 10,963,040 states left on queue.
Progress(54) at 2024-03-08 16:33:37: 178,187,180 states generated (6,084,220 s/min), 48,268,931 distinct states found (1,416,712 ds/min), 11,065,456 states left on queue.
Progress(54) at 2024-03-08 16:34:37: 184,319,516 states generated (6,132,336 s/min), 49,876,603 distinct states found (1,607,672 ds/min), 11,359,653 states left on queue.
Progress(55) at 2024-03-08 16:35:37: 190,798,902 states generated (6,479,386 s/min), 51,418,388 distinct states found (1,541,785 ds/min), 11,479,678 states left on queue.
Progress(55) at 2024-03-08 16:36:37: 196,884,220 states generated (6,085,318 s/min), 53,005,310 distinct states found (1,586,922 ds/min), 11,721,009 states left on queue.
Progress(55) at 2024-03-08 16:37:37: 203,266,109 states generated (6,381,889 s/min), 54,612,401 distinct states found (1,607,091 ds/min), 11,910,850 states left on queue.
Progress(55) at 2024-03-08 16:38:37: 209,268,190 states generated (6,002,081 s/min), 56,195,752 distinct states found (1,583,351 ds/min), 12,240,491 states left on queue.
Progress(55) at 2024-03-08 16:39:37: 215,031,403 states generated (5,763,213 s/min), 57,750,381 distinct states found (1,554,629 ds/min), 12,524,973 states left on queue.
Progress(55) at 2024-03-08 16:40:37: 221,287,100 states generated (6,255,697 s/min), 59,340,518 distinct states found (1,590,137 ds/min), 12,707,015 states left on queue.
Progress(55) at 2024-03-08 16:41:37: 227,381,581 states generated (6,094,481 s/min), 60,747,559 distinct states found (1,407,041 ds/min), 12,800,653 states left on queue.
Progress(55) at 2024-03-08 16:42:37: 233,656,468 states generated (6,274,887 s/min), 62,320,455 distinct states found (1,572,896 ds/min), 13,022,656 states left on queue.
Progress(56) at 2024-03-08 16:43:37: 240,108,708 states generated (6,452,240 s/min), 63,799,350 distinct states found (1,478,895 ds/min), 13,106,801 states left on queue.
Progress(56) at 2024-03-08 16:44:37: 246,183,030 states generated (6,074,322 s/min), 65,340,443 distinct states found (1,541,093 ds/min), 13,270,306 states left on queue.
Progress(56) at 2024-03-08 16:45:37: 252,692,475 states generated (6,509,445 s/min), 66,863,099 distinct states found (1,522,656 ds/min), 13,349,453 states left on queue.
Progress(56) at 2024-03-08 16:46:37: 259,026,665 states generated (6,334,190 s/min), 68,435,827 distinct states found (1,572,728 ds/min), 13,508,565 states left on queue.
Progress(56) at 2024-03-08 16:47:37: 264,880,797 states generated (5,854,132 s/min), 69,986,524 distinct states found (1,550,697 ds/min), 13,818,044 states left on queue.
Progress(56) at 2024-03-08 16:48:37: 270,881,263 states generated (6,000,466 s/min), 71,548,534 distinct states found (1,562,010 ds/min), 14,016,209 states left on queue.
Progress(62) at 2024-03-08 17:56:38: 646,420,912 states generated (5,527,216 s/min), 155,577,287 distinct states found (1,239,972 ds/min), 13,654,652 states left on queue.
Progress(62) at 2024-03-08 17:57:38: 652,535,218 states generated (6,114,306 s/min), 156,837,074 distinct states found (1,259,787 ds/min), 13,509,890 states left on queue.
Progress(60) at 2024-03-08 17:39:38: 551,034,296 states generated (5,305,654 s/min), 135,580,738 distinct states found (1,099,618 ds/min), 15,277,343 states left on queue.
Progress(60) at 2024-03-08 17:40:38: 556,794,270 states generated (5,759,974 s/min), 136,705,742 distinct states found (1,125,004 ds/min), 15,115,946 states left on queue.
Progress(60) at 2024-03-08 17:41:38: 562,152,712 states generated (5,358,442 s/min), 137,896,782 distinct states found (1,191,040 ds/min), 15,073,235 states left on queue.
Progress(60) at 2024-03-08 17:42:38: 567,427,831 states generated (5,275,119 s/min), 138,981,094 distinct states found (1,084,312 ds/min), 15,022,668 states left on queue.
Progress(61) at 2024-03-08 17:43:38: 573,550,544 states generated (6,122,713 s/min), 140,157,291 distinct states found (1,176,197 ds/min), 14,799,408 states left on queue.
Progress(61) at 2024-03-08 17:44:38: 578,980,587 states generated (5,430,043 s/min), 141,401,952 distinct states found (1,244,661 ds/min), 14,782,497 states left on queue.
Progress(61) at 2024-03-08 17:45:38: 584,658,535 states generated (5,677,948 s/min), 142,572,635 distinct states found (1,170,683 ds/min), 14,679,793 states left on queue.
Progress(61) at 2024-03-08 17:46:38: 590,207,173 states generated (5,548,638 s/min), 143,806,097 distinct states found (1,233,462 ds/min), 14,632,088 states left on queue.
Progress(61) at 2024-03-08 17:47:38: 595,495,990 states generated (5,288,817 s/min), 144,955,984 distinct states found (1,149,887 ds/min), 14,556,802 states left on queue.
Progress(61) at 2024-03-08 17:48:38: 600,664,180 states generated (5,168,190 s/min), 146,139,301 distinct states found (1,183,317 ds/min), 14,605,536 states left on queue.
Progress(61) at 2024-03-08 17:49:38: 606,315,636 states generated (5,651,456 s/min), 147,345,001 distinct states found (1,205,700 ds/min), 14,548,784 states left on queue.
Progress(61) at 2024-03-08 17:50:38: 612,048,979 states generated (5,733,343 s/min), 148,589,604 distinct states found (1,244,603 ds/min), 14,458,087 states left on queue.
Progress(61) at 2024-03-08 17:51:38: 617,797,548 states generated (5,748,569 s/min), 149,758,858 distinct states found (1,169,254 ds/min), 14,303,280 states left on queue.
Progress(61) at 2024-03-08 17:52:38: 623,503,136 states generated (5,705,588 s/min), 150,826,126 distinct states found (1,067,268 ds/min), 14,092,041 states left on queue.
Progress(61) at 2024-03-08 17:53:38: 629,260,121 states generated (5,756,985 s/min), 152,077,652 distinct states found (1,251,526 ds/min), 14,022,159 states left on queue.
Progress(61) at 2024-03-08 17:54:38: 634,775,720 states generated (5,515,599 s/min), 153,124,416 distinct states found (1,046,764 ds/min), 13,870,661 states left on queue.
Progress(62) at 2024-03-08 17:55:38: 640,893,696 states generated (6,117,976 s/min), 154,337,315 distinct states found (1,212,899 ds/min), 13,671,345 states left on queue.
Progress(62) at 2024-03-08 17:56:38: 646,420,912 states generated (5,527,216 s/min), 155,577,287 distinct states found (1,239,972 ds/min), 13,654,652 states left on queue.
Progress(62) at 2024-03-08 17:57:38: 652,535,218 states generated (6,114,306 s/min), 156,837,074 distinct states found (1,259,787 ds/min), 13,509,890 states left on queue.
Progress(62) at 2024-03-08 17:58:38: 658,145,121 states generated (5,609,903 s/min), 157,990,938 distinct states found (1,153,864 ds/min), 13,358,950 states left on queue.
Progress(62) at 2024-03-08 17:59:38: 664,135,851 states generated (5,990,730 s/min), 159,326,320 distinct states found (1,335,382 ds/min), 13,340,266 states left on queue.
Progress(62) at 2024-03-08 18:00:38: 670,477,698 states generated (6,341,847 s/min), 160,610,503 distinct states found (1,284,183 ds/min), 13,200,423 states left on queue.
Progress(62) at 2024-03-08 18:01:38: 676,967,614 states generated (6,489,916 s/min), 161,946,541 distinct states found (1,336,038 ds/min), 13,042,149 states left on queue.
Progress(62) at 2024-03-08 18:02:38: 683,796,846 states generated (6,829,232 s/min), 163,246,835 distinct states found (1,300,294 ds/min), 12,758,125 states left on queue.
Progress(62) at 2024-03-08 18:03:38: 690,602,692 states generated (6,805,846 s/min), 164,616,906 distinct states found (1,370,071 ds/min), 12,578,933 states left on queue.
Progress(62) at 2024-03-08 18:04:38: 697,139,480 states generated (6,536,788 s/min), 165,805,329 distinct states found (1,188,423 ds/min), 12,340,682 states left on queue.
Progress(63) at 2024-03-08 18:05:38: 703,806,660 states generated (6,667,180 s/min), 167,229,933 distinct states found (1,424,604 ds/min), 12,196,427 states left on queue.
Progress(63) at 2024-03-08 18:06:38: 710,486,705 states generated (6,680,045 s/min), 168,598,848 distinct states found (1,368,915 ds/min), 12,030,511 states left on queue.
Progress(63) at 2024-03-08 18:07:38: 717,115,963 states generated (6,629,258 s/min), 169,946,359 distinct states found (1,347,511 ds/min), 11,845,052 states left on queue.
Progress(63) at 2024-03-08 18:08:38: 722,758,171 states generated (5,642,208 s/min), 171,123,915 distinct states found (1,177,556 ds/min), 11,739,359 states left on queue.
Progress(63) at 2024-03-08 18:09:38: 728,464,375 states generated (5,706,204 s/min), 172,226,069 distinct states found (1,102,154 ds/min), 11,544,849 states left on queue.
Progress(63) at 2024-03-08 18:10:38: 734,314,833 states generated (5,850,458 s/min), 173,391,622 distinct states found (1,165,553 ds/min), 11,358,143 states left on queue.
Progress(63) at 2024-03-08 18:11:38: 740,908,180 states generated (6,593,347 s/min), 174,591,593 distinct states found (1,199,971 ds/min), 11,035,465 states left on queue.
Progress(63) at 2024-03-08 18:12:38: 747,840,097 states generated (6,931,917 s/min), 175,940,292 distinct states found (1,348,699 ds/min), 10,837,890 states left on queue.
Progress(64) at 2024-03-08 18:13:38: 754,523,348 states generated (6,683,251 s/min), 177,206,116 distinct states found (1,265,824 ds/min), 10,555,590 states left on queue.
Progress(64) at 2024-03-08 18:14:38: 760,793,755 states generated (6,270,407 s/min), 178,486,404 distinct states found (1,280,288 ds/min), 10,393,062 states left on queue.
Progress(64) at 2024-03-08 18:15:38: 767,622,733 states generated (6,828,978 s/min), 179,837,231 distinct states found (1,350,827 ds/min), 10,154,762 states left on queue.
Progress(64) at 2024-03-08 18:16:38: 774,393,887 states generated (6,771,154 s/min), 181,218,189 distinct states found (1,380,958 ds/min), 9,984,538 states left on queue.
Progress(64) at 2024-03-08 18:17:38: 781,397,434 states generated (7,003,547 s/min), 182,577,602 distinct states found (1,359,413 ds/min), 9,710,179 states left on queue.
Progress(64) at 2024-03-08 18:18:38: 788,635,095 states generated (7,237,661 s/min), 183,851,320 distinct states found (1,273,718 ds/min), 9,317,915 states left on queue.
Progress(64) at 2024-03-08 18:19:38: 795,436,216 states generated (6,801,121 s/min), 185,118,752 distinct states found (1,267,432 ds/min), 9,067,468 states left on queue.
Progress(65) at 2024-03-08 18:20:38: 801,741,004 states generated (6,304,788 s/min), 186,343,846 distinct states found (1,225,094 ds/min), 8,809,810 states left on queue.
Progress(65) at 2024-03-08 18:21:38: 808,153,506 states generated (6,412,502 s/min), 187,620,337 distinct states found (1,276,491 ds/min), 8,602,494 states left on queue.
Progress(65) at 2024-03-08 18:22:38: 814,436,953 states generated (6,283,447 s/min), 188,840,825 distinct states found (1,220,488 ds/min), 8,352,880 states left on queue.
Progress(65) at 2024-03-08 18:23:38: 821,023,830 states generated (6,586,877 s/min), 190,064,136 distinct states found (1,223,311 ds/min), 8,088,038 states left on queue.
Progress(65) at 2024-03-08 18:24:38: 827,434,846 states generated (6,411,016 s/min), 191,265,485 distinct states found (1,201,349 ds/min), 7,783,118 states left on queue.
Progress(65) at 2024-03-08 18:25:38: 834,167,874 states generated (6,733,028 s/min), 192,447,726 distinct states found (1,182,241 ds/min), 7,453,753 states left on queue.
Progress(66) at 2024-03-08 18:26:38: 840,602,072 states generated (6,434,198 s/min), 193,663,012 distinct states found (1,215,286 ds/min), 7,160,170 states left on queue.
Progress(66) at 2024-03-08 18:27:38: 846,940,640 states generated (6,338,568 s/min), 194,906,047 distinct states found (1,243,035 ds/min), 6,954,065 states left on queue.
Progress(66) at 2024-03-08 18:28:38: 853,309,920 states generated (6,369,280 s/min), 196,088,969 distinct states found (1,182,922 ds/min), 6,663,110 states left on queue.
Progress(66) at 2024-03-08 18:29:38: 859,779,215 states generated (6,469,295 s/min), 197,276,162 distinct states found (1,187,193 ds/min), 6,351,490 states left on queue.
Progress(66) at 2024-03-08 18:30:38: 866,609,944 states generated (6,830,729 s/min), 198,445,427 distinct states found (1,169,265 ds/min), 5,973,857 states left on queue.
Progress(67) at 2024-03-08 18:31:38: 872,903,300 states generated (6,293,356 s/min), 199,628,828 distinct states found (1,183,401 ds/min), 5,692,738 states left on queue.
Progress(67) at 2024-03-08 18:32:38: 879,291,118 states generated (6,387,818 s/min), 200,854,407 distinct states found (1,225,579 ds/min), 5,428,038 states left on queue.
Progress(67) at 2024-03-08 18:33:38: 885,838,312 states generated (6,547,194 s/min), 202,007,809 distinct states found (1,153,402 ds/min), 5,069,662 states left on queue.
Progress(67) at 2024-03-08 18:34:38: 892,558,394 states generated (6,720,082 s/min), 203,129,018 distinct states found (1,121,209 ds/min), 4,658,919 states left on queue.
Progress(68) at 2024-03-08 18:35:38: 899,065,504 states generated (6,507,110 s/min), 204,355,288 distinct states found (1,226,270 ds/min), 4,363,696 states left on queue.
Progress(68) at 2024-03-08 18:36:38: 905,767,830 states generated (6,702,326 s/min), 205,510,653 distinct states found (1,155,365 ds/min), 3,967,401 states left on queue.
Progress(68) at 2024-03-08 18:37:38: 912,380,551 states generated (6,612,721 s/min), 206,622,750 distinct states found (1,112,097 ds/min), 3,536,222 states left on queue.
Progress(69) at 2024-03-08 18:38:38: 918,910,059 states generated (6,529,508 s/min), 207,828,221 distinct states found (1,205,471 ds/min), 3,215,751 states left on queue.
Progress(69) at 2024-03-08 18:39:38: 925,734,751 states generated (6,824,692 s/min), 208,948,080 distinct states found (1,119,859 ds/min), 2,720,136 states left on queue.
Progress(70) at 2024-03-08 18:40:38: 932,352,579 states generated (6,617,828 s/min), 210,126,153 distinct states found (1,178,073 ds/min), 2,370,373 states left on queue.
Progress(70) at 2024-03-08 18:41:38: 938,937,615 states generated (6,585,036 s/min), 211,192,108 distinct states found (1,065,955 ds/min), 1,896,866 states left on queue.
Progress(71) at 2024-03-08 18:42:38: 945,540,936 states generated (6,603,321 s/min), 212,353,511 distinct states found (1,161,403 ds/min), 1,496,134 states left on queue.
Progress(72) at 2024-03-08 18:43:38: 952,116,280 states generated (6,575,344 s/min), 213,466,325 distinct states found (1,112,814 ds/min), 1,054,424 states left on queue.
Progress(74) at 2024-03-08 18:44:38: 958,645,146 states generated (6,528,866 s/min), 214,587,859 distinct states found (1,121,534 ds/min), 604,097 states left on queue.
Progress(79) at 2024-03-08 18:45:38: 964,833,137 states generated (6,187,991 s/min), 215,722,330 distinct states found (1,134,471 ds/min), 111,369 states left on queue.
"Writing stats to file: MCccfraftWithReconfig_stats.json"
Model checking completed. No error has been found.
  Estimates of the probability that TLC did not check all reachable states
  because two distinct states had the same fingerprint:
  calculated (optimistic):  val = .0088
  based on the actual fingerprints:  val = .0033
966095841 states generated, 216150986 distinct states found, 0 states left on queue.
The depth of the complete state graph search is 99.
The average outdegree of the complete state graph is 1 (minimum is 0, the maximum 7 and the 95th percentile is 2).
Finished in 02h 41min at (2024-03-08 18:46:00)
lemmy commented 3 months ago
lemmy@ccf-model-checking:~/ccf/tla/consensus$ git log -n1
commit 1c83e53c9c208ded61afd8e05ee878a12f83eb6c (HEAD -> update_retirement_behaviour, achamayou/update_retirement_behaviour)
Author: Amaury Chamayou <amchamay@microsoft.com>
Date:   Fri Mar 8 17:30:20 2024 +0000

    .
lemmy@ccf-model-checking:~/ccf/tla/consensus$ ./tlc
[0.323s][info][jfr,startup] Started recording 1.
[0.323s][info][jfr,startup]
[0.323s][info][jfr,startup] Use jcmd 59047 JFR.dump name=1 to copy recording data to file.
TLC2 Version 2.18 of Day Month 20?? (rev: a422f06)
Running breadth-first search Model-Checking with fp 22 and seed -729127409143924031 with 96 workers on 96 cores with 31403MB heap and 131072MB offheap memory [pid: 59047] (Linux 5.15.0-1058-azure amd64, Private Build 21.0.2 x86_64, OffHeapDiskFPSet, DiskByteArrayQueue).
Parsing file /home/lemmy/ccf/tla/consensus/MCccfraft.tla
Parsing file /home/lemmy/ccf/tla/consensus/ccfraft.tla
Parsing file /home/lemmy/ccf/tla/consensus/StatsFile.tla
Parsing file /home/lemmy/ccf/tla/consensus/MCAliases.tla
Parsing file /tmp/tlc-15342996497103544925/_TLCTrace.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/_TLCTrace.tla)
Parsing file /tmp/tlc-15342996497103544925/Naturals.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Naturals.tla)
Parsing file /tmp/tlc-15342996497103544925/FiniteSets.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/FiniteSets.tla)
Parsing file /tmp/tlc-15342996497103544925/Sequences.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Sequences.tla)
Parsing file /tmp/tlc-15342996497103544925/TLC.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/TLC.tla)
Parsing file /tmp/tlc-15342996497103544925/FiniteSetsExt.tla (jar:file:/home/lemmy/ccf/tla/consensus/CommunityModules-deps.jar!/FiniteSetsExt.tla)
Parsing file /tmp/tlc-15342996497103544925/SequencesExt.tla (jar:file:/home/lemmy/ccf/tla/consensus/CommunityModules-deps.jar!/SequencesExt.tla)
Parsing file /tmp/tlc-15342996497103544925/Functions.tla (jar:file:/home/lemmy/ccf/tla/consensus/CommunityModules-deps.jar!/Functions.tla)
Parsing file /tmp/tlc-15342996497103544925/Folds.tla (jar:file:/home/lemmy/ccf/tla/consensus/CommunityModules-deps.jar!/Folds.tla)
Parsing file /tmp/tlc-15342996497103544925/Json.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Json.tla)
Parsing file /home/lemmy/ccf/tla/consensus/Network.tla
Parsing file /tmp/tlc-15342996497103544925/Bags.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Bags.tla)
Parsing file /tmp/tlc-15342996497103544925/TLCExt.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/TLCExt.tla)
Parsing file /tmp/tlc-15342996497103544925/Integers.tla (jar:file:/home/lemmy/ccf/tla/consensus/tla2tools.jar!/tla2sany/StandardModules/Integers.tla)
Semantic processing of module Naturals
Semantic processing of module Sequences
Semantic processing of module FiniteSets
Semantic processing of module TLC
Semantic processing of module Folds
Semantic processing of module Functions
Semantic processing of module FiniteSetsExt
Semantic processing of module SequencesExt
Semantic processing of module Bags
Semantic processing of module Network
Semantic processing of module ccfraft
Semantic processing of module Json
Semantic processing of module StatsFile
Semantic processing of module Integers
Semantic processing of module TLCExt
Semantic processing of module MCAliases
Semantic processing of module _TLCTrace
Semantic processing of module MCccfraft
Starting... (2024-03-08 20:29:54)
Computing initial states...
Finished computing initial states: 1 distinct state generated at 2024-03-08 20:29:57.
Progress(36) at 2024-03-08 20:30:00: 200,634 states generated (200,634 s/min), 73,637 distinct states found (73,637 ds/min), 24,805 states left on queue.
Progress(52) at 2024-03-08 20:31:00: 6,799,230 states generated (6,598,596 s/min), 1,835,387 distinct states found (1,761,750 ds/min), 238,839 states left on queue.
Progress(56) at 2024-03-08 20:32:00: 13,761,722 states generated (6,962,492 s/min), 3,663,323 distinct states found (1,827,936 ds/min), 627,171 states left on queue.
Progress(58) at 2024-03-08 20:33:00: 20,592,651 states generated (6,830,929 s/min), 5,368,991 distinct states found (1,705,668 ds/min), 893,481 states left on queue.
Progress(60) at 2024-03-08 20:34:00: 27,146,672 states generated (6,554,021 s/min), 6,922,792 distinct states found (1,553,801 ds/min), 994,562 states left on queue.
Progress(61) at 2024-03-08 20:35:00: 34,112,154 states generated (6,965,482 s/min), 8,535,165 distinct states found (1,612,373 ds/min), 1,108,566 states left on queue.
Progress(63) at 2024-03-08 20:36:00: 40,912,395 states generated (6,800,241 s/min), 10,016,102 distinct states found (1,480,937 ds/min), 1,030,203 states left on queue.
Progress(64) at 2024-03-08 20:37:00: 47,613,450 states generated (6,701,055 s/min), 11,431,092 distinct states found (1,414,990 ds/min), 945,936 states left on queue.
Progress(66) at 2024-03-08 20:38:00: 54,549,763 states generated (6,936,313 s/min), 12,818,735 distinct states found (1,387,643 ds/min), 743,862 states left on queue.
Progress(69) at 2024-03-08 20:39:00: 61,694,970 states generated (7,145,207 s/min), 14,135,254 distinct states found (1,316,519 ds/min), 419,585 states left on queue.
"Writing stats to file: MCccfraftAtomicReconfig_stats.json"
Model checking completed. No error has been found.
  Estimates of the probability that TLC did not check all reachable states
  because two distinct states had the same fingerprint:
  calculated (optimistic):  val = 4.3E-5
  based on the actual fingerprints:  val = 4.3E-5
67553103 states generated, 15125059 distinct states found, 0 states left on queue.
The depth of the complete state graph search is 83.
The average outdegree of the complete state graph is 1 (minimum is 0, the maximum 7 and the 95th percentile is 3).
Finished in 10min 01s at (2024-03-08 20:39:55)
lemmy commented 3 months ago

MCccfraftAtomicReconfig.cfg with MaxClientRequest = 2 made it to diameter 71 before the spot instance was killed:

in), 60,195,842 states left on queue.
Progress(71) at 2024-03-09 02:33:48: 1,866,804,206 states generated (5,202,228 s/min), 439,548,606 distinct states found (1,185,940 ds/min), 60,500,194 states left on queue.
Progress(71) at 2024-03-09 02:34:48: 1,871,900,059 states generated (5,095,853 s/min), 440,601,085 distinct states found (1,052,479 ds/min), 60,505,218 states left on queue.
Progress(71) at 2024-03-09 02:35:48: 1,877,092,170 states generated (5,192,111 s/min), 441,859,618 distinct states found (1,258,533 ds/min), 60,825,327 states left on queue.
Progress(71) at 2024-03-09 02:36:48: 1,882,477,331 states generated (5,385,161 s/min), 442,980,211 distinct states found (1,120,593 ds/min), 60,846,176 states left on queue.
Progress(71) at 2024-03-09 02:37:48: 1,887,923,773 states generated (5,446,442 s/min), 444,088,513 distinct states found (1,108,302 ds/min), 60,832,651 states left on queue.
Progress(71) at 2024-03-09 02:38:48: 1,893,332,583 states generated (5,408,810 s/min), 445,204,253 distinct states found (1,115,740 ds/min), 60,901,538 states left on queue.
lemmy commented 3 months ago
  • how this interacts with snapshots (specifically, a node started from a snapshot emitted while a reconfiguration is in-progress - how does it interact with retiring nodes?), but I think it's safe to look at that in a separate PR since we have so no notion of snapshotting in the driver or spec.

For posterity: Modeling snapshots in the specification should be straightforward; a node non-deterministically loses all its state except for whats contained in a snapshot?

eddyashton commented 3 months ago

For posterity: Modeling snapshots in the specification should be straightforward; a node non-deterministically loses all its state except for whats contained in a snapshot?

No, snapshotting in CCF is a node startup mechanism, where that sounds like a restart behaviour. Roughly speaking, it means that a node has some initial state when it joins/when it is created. Establishing exactly what that state is and how it maps to the state in the spec, I haven't considered.

lemmy commented 3 months ago

What's part of a snapshot?

eddyashton commented 3 months ago

What's part of a snapshot?

A complete state of the KV, from which we can rehydrate other things (eg - configurations for Raft). Maybe we can construct an equivalent from the model's reconfiguration transactions, haven't looked.

lemmy commented 3 months ago

\o/