basho / riak_kv

Riak Key/Value Store
Apache License 2.0

TictacAAE fullsync - need for further safety measures #1775

Closed martinsumner closed 3 years ago

martinsumner commented 3 years ago

In part related to https://github.com/basho/riak_kv/issues/1765. When testing intra-cluster AAE with very large vnodes and large deltas (either genuine deltas, or false deltas while awaiting cache repairs), the behaviour of Tictac AAE was neither efficient nor sufficiently conservative.

Some of the measures related to intra-cluster AAE will also lead to improvements in inter-cluster AAE (such as active pruning of the AAE runner queue). However, there is still a risk that inter-cluster AAE behaviour may not be predictable, conservative or efficient in the face of large deltas with large stores.

Following changes proposed:

martinsumner commented 3 years ago

After further investigation, some issues have arisen.

  1. There is a fundamental bug in the per-bucket full-sync. There is a conflicting view in Riak between the external and internal clients on the format of a modified range - {date, Low, High} or {Low, High}. This isn't handled correctly in riak_kv_ttaaefs_manager, and so the modified range was being ignored on the local client.

  2. The behaviour of fetch_clocks for clock_compare is different between full fullsync (e.g. nval based) and partial fullsync (e.g. bucket based). This difference isn't an immediate problem, but it perhaps increases the potential for confusion in the long term. Currently nval fetch_clocks will use the aae_runner for each vnode query, and attempt to re-write the segments (potentially correcting them) in the tree cache as it runs through the query. The per-bucket alternative uses the AF3_QUEUE riak_core node_worker_pool, and does not re-write/repair segments.
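
To illustrate the range-format mismatch in (1), here is a minimal sketch in Python rather than Erlang. It is not the code in riak_kv_ttaaefs_manager; the function name `normalise_modified_range` and the use of `None` for "no range filter" are assumptions made for illustration. The idea is that both the tagged form {date, Low, High} and the bare form {Low, High} are reduced to one shape before use, and anything unrecognised raises an error instead of being silently dropped.

```python
from typing import Optional, Tuple


def normalise_modified_range(rng) -> Optional[Tuple[int, int]]:
    """Reduce either range format to a bare (low, high) pair.

    Accepts ("date", low, high), (low, high), or None (hypothetical
    marker for "no modified-range filter"). Raises ValueError for any
    other shape so that a format mismatch is never silently ignored.
    """
    if rng is None:
        return None
    if len(rng) == 3 and rng[0] == "date":
        # Tagged form, analogous to {date, Low, High}
        _, low, high = rng
    elif len(rng) == 2:
        # Bare form, analogous to {Low, High}
        low, high = rng
    else:
        raise ValueError(f"unrecognised modified range: {rng!r}")
    if low > high:
        raise ValueError(f"modified range is inverted: {rng!r}")
    return (low, high)
```

Failing loudly on an unknown shape is the relevant design choice here: the bug described in (1) was that the range was ignored, so the sync ran without the filter and gave no indication of it.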

The nextgenrepl_ttaaefs_manual riak_test has been extended to catch (1) - https://github.com/basho/riak_test/pull/1352. A fix has been tested, as well as a switch to control the behaviour for (2) - https://github.com/basho/riak_kv/pull/1776.

martinsumner commented 3 years ago

If the intention is not to repair tree caches as part of fetch_clocks when running full ttaaefs fullsync, then it becomes easier to consider supporting a schedule that includes hour and day syncs.

In an hour or day sync, a complete comparison would be made between the AAE tree caches, but the hour sync will assume any delta was in the last hour of changes (by modified date). Likewise, the day sync will assume any delta is in the last day of changes.

These fetch_clocks queries can then be run with a last-modified-date range, and on very large stores this will be much faster than a full nval compare.
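
As a rough sketch of how the sync type could map to a last-modified-date range, the Python below computes the window for each type. The names `modified_range_for` and `in_range`, the "all" sync type, and the use of epoch seconds are assumptions for illustration, not the riak_kv implementation, which may also add a safety margin to the window.

```python
import time
from typing import Optional, Tuple

HOUR_SECONDS = 60 * 60
DAY_SECONDS = 24 * HOUR_SECONDS


def modified_range_for(sync_type: str,
                       now: Optional[int] = None) -> Optional[Tuple[int, int]]:
    """Return the (low, high) last-modified window for a sync type.

    "all" (hypothetical name for a full compare) returns None, meaning
    no range filter; "hour" and "day" return a window ending at `now`.
    """
    if now is None:
        now = int(time.time())
    if sync_type == "all":
        return None
    if sync_type == "hour":
        return (now - HOUR_SECONDS, now)
    if sync_type == "day":
        return (now - DAY_SECONDS, now)
    raise ValueError(f"unknown sync type: {sync_type!r}")


def in_range(last_modified: int, rng: Optional[Tuple[int, int]]) -> bool:
    """True if an object's last-modified time should be compared."""
    return rng is None or rng[0] <= last_modified <= rng[1]
```

With a window in place, only objects modified inside it need their clocks fetched and compared, which is where the saving over a full nval compare would come from on a very large store.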

martinsumner commented 3 years ago

https://github.com/basho/riak_kv/pull/1776