Open joshmh opened 9 years ago
Improved periodic global counter increment logic in a client could work as follows:
Trigger `maintainCounter` every `k` milliseconds, where a reasonable value would be 20 ms.
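function maintainCounter() {
  // Nothing new arrived since the last tick, so there is nothing to do.
  if (noNewRecordsSinceLastCheck) return;
  // Only every n-th active tick attempts an increment.
  if (localCounter % n === 0 && noRecordsWithCurrentGlobalCounter()) {
    incrementGlobalCounter();
  }
  localCounter++;
}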
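To make the control flow above concrete, here is a minimal runnable sketch. It is only an illustration: `createMaintainer`, `hasNewRecords` (the inverse of `noNewRecordsSinceLastCheck`), and the injected stubs are hypothetical names, and the real checks would go to the database rather than to in-memory callbacks.

```javascript
// Sketch only: every dependency is an injected, hypothetical stub.
function createMaintainer({ n, hasNewRecords, noRecordsWithCurrentGlobalCounter, incrementGlobalCounter }) {
  let localCounter = 0;

  function maintainCounter() {
    // Idle ticks return early and do not advance localCounter.
    if (!hasNewRecords()) return;
    // Only every n-th active tick attempts an increment.
    if (localCounter % n === 0 && noRecordsWithCurrentGlobalCounter()) {
      incrementGlobalCounter();
    }
    localCounter++;
  }

  // Run the check every k milliseconds (20 ms is the value suggested above).
  function start(k = 20) {
    return setInterval(maintainCounter, k);
  }

  return { maintainCounter, start };
}

// Drive ten ticks by hand with n = 4: increments happen on ticks 0, 4 and 8.
let globalCounter = 0;
const maintainer = createMaintainer({
  n: 4,
  hasNewRecords: () => true,
  noRecordsWithCurrentGlobalCounter: () => true,
  incrementGlobalCounter: () => { globalCounter++; },
});
for (let i = 0; i < 10; i++) maintainer.maintainCounter();
console.log(globalCounter); // 3
```

In a client, `maintainer.start(20)` would replace the manual loop; the loop is used here only so the result is deterministic.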
The problem with `maintainCounter` is that it's sensitive to event traffic patterns, which could change abruptly. Instead, it should depend on the number of clients connected, which shouldn't change abruptly. The solution is for `maintainCounter` to have its own `mcLocalCounter` and backoff regime.
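function maintainCounter() {
  if (noNewLocalRecordsSinceLastCheck) return;
  // Attempt an increment only on every mcn-th active tick.
  if (mcLocalCounter % mcn === 0 && noRecordsWithCurrentGlobalCounter()) {
    incrementGlobalCounter();
  }
  // Back off: double the divisor whenever an increment conflicted.
  if (conflictDetected) mcn <<= 1;
  // Recover: halve the divisor every 100 active ticks, never below 1.
  if (mcLocalCounter % 100 === 0 && mcn > 1) mcn >>= 1;
  mcLocalCounter++;
}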
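The backoff behaviour can be traced with a small simulation. This is a sketch under assumptions: `createBackoffMaintainer` and `tryIncrementGlobalCounter` are hypothetical names, the increment is assumed to report a conflict by returning `false`, and conflicts are scripted instead of coming from a real database.

```javascript
// Sketch only: conflicts are scripted; a real client would detect them from the database.
function createBackoffMaintainer({ hasNewLocalRecords, noRecordsWithCurrentGlobalCounter, tryIncrementGlobalCounter }) {
  let mcLocalCounter = 0;
  let mcn = 1;

  function maintainCounter() {
    if (!hasNewLocalRecords()) return;
    let conflictDetected = false;
    if (mcLocalCounter % mcn === 0 && noRecordsWithCurrentGlobalCounter()) {
      // Assumption: the increment returns false when it hit a conflict.
      conflictDetected = !tryIncrementGlobalCounter();
    }
    if (conflictDetected) mcn <<= 1;                      // back off
    if (mcLocalCounter % 100 === 0 && mcn > 1) mcn >>= 1; // recover
    mcLocalCounter++;
  }

  return { maintainCounter, divisor: () => mcn };
}

function runBackoffDemo() {
  // Scripted outcomes for the first three attempts: success, conflict, conflict.
  const outcomes = [true, false, false];
  let counterValue = 0;
  const m = createBackoffMaintainer({
    hasNewLocalRecords: () => true,
    noRecordsWithCurrentGlobalCounter: () => true,
    tryIncrementGlobalCounter: () => {
      const ok = outcomes.length > 0 ? outcomes.shift() : true;
      if (ok) counterValue++;
      return ok;
    },
  });
  const tick = (times) => { for (let i = 0; i < times; i++) m.maintainCounter(); };

  tick(3);   // ticks 0..2: two conflicts double the divisor twice
  const afterConflicts = m.divisor();
  tick(98);  // ticks 3..100: the check at tick 100 halves it once
  const afterFirstDecay = m.divisor();
  tick(100); // ticks 101..200: the check at tick 200 halves it again
  const afterSecondDecay = m.divisor();
  return { afterConflicts, afterFirstDecay, afterSecondDecay };
}

const backoffTrace = runBackoffDemo();
console.log(backoffTrace); // { afterConflicts: 4, afterFirstDecay: 2, afterSecondDecay: 1 }
```

With a 20 ms tick, 100 active ticks is about two seconds, so in this sketch a divisor of 4 takes roughly four seconds of steady activity to decay back to 1.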
For simplicity, we can dispense entirely with incrementing `globalCounter` while publishing, and therefore with managing backoff for publishing. Consuming latency will then depend entirely on `maintainCounter`.
Currently, Merrimack is not designed for more than ~100 incoming messages per second. In order to remove this limitation, we need to limit global counter incrementation. Here's an idea:
- Records are ordered by `[globalCounter, randomNonce]`.
- `globalCounter` is not required to be unique per record.
- Each client keeps a local `counter` and a local divisor `n`.
- Initially, `n = 1`.
- On a conflict while incrementing `globalCounter` (retriable FDBError), the divisor `n` is doubled.
- `globalCounter` is only increased when `counter % n === 0`.
- Periodically, `n` is halved (if `n === 1` we just keep it there).
- Only records below the current `globalCounter` are consumed. This is because we can't guarantee that a new record with the current `globalCounter` but a lower `randomNonce` won't appear later on.
- If `n` is currently high and traffic slows abruptly, it will take a few seconds for `n` to get back to `1`. In this case, it's possible there would be a few seconds when consuming is slow, because the `globalCounter` is not being incremented fast enough. In fact, consuming could stop altogether if traffic stops completely and `globalCounter` isn't subsequently incremented. This is probably a pretty weird edge case, though.
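
The consuming rule and the stall edge case can be illustrated with an in-memory sketch. It reads the rule as "only records whose `globalCounter` is strictly below the current value are consumed", which is an interpretation of the text above. `store`, `publish`, and `consume` are hypothetical names, and a plain array stands in for the database.

```javascript
// Sketch only: an array stands in for the shared, ordered store.
const store = { globalCounter: 0, records: [] };

// Stamp each record with the current globalCounter and a random nonce.
function publish(payload) {
  const randomNonce = Math.floor(Math.random() * 2 ** 32);
  store.records.push({ key: [store.globalCounter, randomNonce], payload });
}

// Return records after `lastConsumedCounter` that are safe to read.
// Records stamped with the current globalCounter are held back, because
// another record with the same counter and a lower nonce could still arrive.
function consume(lastConsumedCounter) {
  return store.records
    .filter((r) => r.key[0] > lastConsumedCounter && r.key[0] < store.globalCounter)
    .sort((x, y) => x.key[0] - y.key[0] || x.key[1] - y.key[1]);
}

publish("a");
publish("b");
// Both records carry globalCounter 0, the current value, so nothing is readable yet.
const beforeIncrement = consume(-1);

store.globalCounter++; // some client's counter maintenance succeeds
publish("c");          // stamped with globalCounter 1

// "a" and "b" are now readable; "c" waits for the next increment.
const afterIncrement = consume(-1);
console.log(beforeIncrement.length, afterIncrement.length); // 0 2
```

The record `"c"` shows the edge case from the last bullet: if traffic stops and nothing increments `globalCounter` again, that record is never released to consumers.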