TimelyDataflow / differential-dataflow

An implementation of differential dataflow using timely dataflow in Rust.
MIT License

DNM: A merge batcher that gracefully handles non-ready data #463

Open antiguru opened 5 months ago

antiguru commented 5 months ago

This PR shows how to implement a merge batcher that is smarter about reconsidering data that is in advance of the current frontier (a simplified sketch follows the list below). It does the following things:

  1. It extracts ready data from chains instead of merging all chains and then extracting data from the last remaining chain.
  2. It separates canonicalization from the extraction operation, so that canonicalization can be reused for both inserts and extracts.
  3. It records a frontier per block, which allows for efficient frontier testing: if the extraction (upper) frontier is less than or equal to the block's frontier, i.e., all of the block's data is in advance of the extraction frontier, we do not touch the block at all.
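
To make items 1 and 3 concrete, here is a minimal, self-contained sketch rather than the PR's actual code: it uses totally ordered `u64` timestamps (so a frontier degenerates to a single lower-bound time) and hypothetical names (`Block`, `extract_ready`), whereas the PR works with timely antichains and the batcher's existing chain structure. It illustrates the per-block frontier test that lets extraction skip blocks containing no ready data.

```rust
// Illustrative sketch only: totally ordered `u64` times stand in for timely
// timestamps, so a "frontier" is just the minimum time in a block.

/// A block of updates plus the lower bound of the times it contains.
struct Block<D> {
    updates: Vec<(D, u64, i64)>,
    frontier: u64,
}

impl<D> Block<D> {
    fn new(updates: Vec<(D, u64, i64)>) -> Self {
        let frontier = updates.iter().map(|(_, t, _)| *t).min().unwrap_or(u64::MAX);
        Self { updates, frontier }
    }
}

/// Move updates with `time < upper` into `ready`, skipping blocks whose
/// recorded frontier shows they contain no ready data at all.
fn extract_ready<D>(blocks: &mut Vec<Block<D>>, upper: u64, ready: &mut Vec<(D, u64, i64)>) {
    for block in blocks.iter_mut() {
        // Per-block test: if `upper <= block.frontier`, every time in the
        // block is in advance of `upper`, so we never inspect its contents.
        if upper <= block.frontier {
            continue;
        }
        // Otherwise split this block's updates into ready and retained.
        let mut retained = Vec::new();
        for (d, t, r) in block.updates.drain(..) {
            if t < upper {
                ready.push((d, t, r));
            } else {
                retained.push((d, t, r));
            }
        }
        block.updates = retained;
        block.frontier = block.updates.iter().map(|(_, t, _)| *t).min().unwrap_or(u64::MAX);
    }
    // Drop blocks that were fully drained.
    blocks.retain(|block| !block.updates.is_empty());
}

fn main() {
    let mut blocks = vec![
        Block::new(vec![("a", 1, 1), ("b", 7, 1)]),  // straddles upper = 5
        Block::new(vec![("c", 7, 1), ("d", 9, -1)]), // entirely beyond upper = 5: skipped
    ];
    let mut ready = Vec::new();
    extract_ready(&mut blocks, 5, &mut ready);
    assert_eq!(ready, vec![("a", 1, 1)]);
    assert_eq!(blocks.len(), 2); // both blocks still hold not-yet-ready data
    println!("ready: {:?}", ready);
}
```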

This should reduce the amount of work spent on outstanding data from $O(n)$, where $n$ is the number of records in the merge batcher, to $O(n/1024)$, by testing only the per-block frontier rather than the data each block contains.
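
For a concrete sense of scale (assuming the constant 1024 above is the per-block capacity): with $n = 2^{20}$ outstanding records, extraction would perform roughly $2^{20} / 1024 = 1024$ per-block frontier tests instead of $2^{20}$ per-record tests.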

I am sorry for the formatting noise which originates from copying this code from DD to Mz and back again :/

antiguru commented 5 months ago

We realized that the approach taken in this PR changes what we report as the next lower bound for data still to be extracted. As before, seal captures a lower bound frontier of all the times in the batcher, but its semantics are different. Previously, the reported frontier was accurate, i.e., there existed data at the reported frontier. Now it is only some lower bound, with no guarantee that data actually exists at it.

The reason for this is that in the past, seal would merge all data into a single chain, so each (d, t) pair appeared at most once. After this change we maintain a logarithmic number of chains, and a given (d, t) pair can occur in several of them; as a result we can only compute a lower bound frontier of the uncompacted data, not the precise lower frontier of the compacted data.
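
The effect is easiest to see with a small example. The sketch below is illustrative rather than the PR's code (hypothetical helper names, `u64` times, `i64` diffs): the same (data, time) pair sits in two chains with cancelling diffs, so a lower bound computed over the uncompacted chains reports a time at which the consolidated data contains nothing.

```rust
// Illustrative sketch only: hypothetical helpers, `u64` times, `i64` diffs.

use std::collections::HashMap;
use std::hash::Hash;

type Update<D> = (D, u64, i64);

/// Lower bound (minimum time) over all updates in all chains, computed
/// without consolidating across chains.
fn reported_lower_bound<D>(chains: &[Vec<Update<D>>]) -> Option<u64> {
    chains.iter().flatten().map(|(_, t, _)| *t).min()
}

/// Consolidate across chains: sum diffs per (data, time) pair and drop zeros.
fn consolidated<D: Clone + Eq + Hash>(chains: &[Vec<Update<D>>]) -> Vec<Update<D>> {
    let mut totals: HashMap<(D, u64), i64> = HashMap::new();
    for (d, t, r) in chains.iter().flatten() {
        *totals.entry((d.clone(), *t)).or_insert(0) += *r;
    }
    totals
        .into_iter()
        .filter(|(_, r)| *r != 0)
        .map(|((d, t), r)| (d, t, r))
        .collect()
}

fn main() {
    // The same (data, time) pair appears in two chains with cancelling diffs.
    let chains = vec![
        vec![("key", 3, 1), ("other", 10, 1)],
        vec![("key", 3, -1)],
    ];

    // Without merging the chains, the best we can report is time 3 ...
    assert_eq!(reported_lower_bound(&chains), Some(3));

    // ... but after consolidation nothing exists at time 3; the accurate
    // frontier of the compacted data would start at time 10.
    let mut compacted = consolidated(&chains);
    compacted.sort();
    assert_eq!(compacted, vec![("other", 10, 1)]);
    println!("reported lower bound: 3, compacted data: {:?}", compacted);
}
```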

This can become a problem when all the data cancels out, and it can cause an unknown amount of additional work for the rest of the system, which has to hold on to more capabilities and may need to ask for data more times than necessary.

We don't have an immediate solution for this problem, but there are some options: