Activator: Bound memory utilization by compaction and de-duplication

The Activator object allows one to force the scheduling of Timely operators even in the absence of progress changes. Care must be taken to avoid scheduling the operators too often, as all scheduled activations are stored in memory until the scheduler picks them up.

To bound this memory use, we propose the following change to how activations are recorded:

Record the activation as before. Once the size of the internal data structures exceeds a threshold, say 2x the size after the last compaction, compact the activations: sort them by the operator's path and de-duplicate them by collapsing activations of the same operator into one.
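
The following is a minimal sketch of the idea, not the code in this PR. `ActivationBuffer`, its fields, and the `MIN_THRESHOLD` floor are hypothetical names chosen for illustration; Timely's actual internal data structures differ. It assumes an operator path can be represented as a `Vec<usize>`.

```rust
/// Hypothetical stand-in for the activation buffer (not Timely's real type).
#[derive(Default)]
struct ActivationBuffer {
    /// Pending activations, each identified by the operator's path.
    pending: Vec<Vec<usize>>,
    /// Number of entries that remained after the most recent compaction.
    last_compacted_len: usize,
}

impl ActivationBuffer {
    /// Assumed floor so that a small buffer is not compacted on every call.
    const MIN_THRESHOLD: usize = 16;

    /// Record an activation as before, then compact if the buffer has
    /// grown past 2x its size after the last compaction.
    fn activate(&mut self, path: &[usize]) {
        self.pending.push(path.to_vec());
        let threshold = (2 * self.last_compacted_len).max(Self::MIN_THRESHOLD);
        if self.pending.len() > threshold {
            self.compact();
        }
    }

    /// Sort by operator path, then collapse duplicates into one entry.
    fn compact(&mut self) {
        self.pending.sort();
        self.pending.dedup();
        self.last_compacted_len = self.pending.len();
    }

    /// Hand the de-duplicated activations to the scheduler and reset.
    fn drain(&mut self) -> Vec<Vec<usize>> {
        self.compact();
        self.last_compacted_len = 0;
        std::mem::take(&mut self.pending)
    }
}
```

With `n` distinct operators pending, the buffer in this sketch holds at most `max(2n, MIN_THRESHOLD) + 1` entries between compactions, and because the threshold doubles relative to the compacted size, the sorting cost is amortized over the activations that triggered it.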

In Materialize, we often activate an upstream operator when a downstream dataflow operator is dropped, or activate a source operator when the source has new data and Timely needs to schedule it. To avoid sending too many activations, we use a pattern where the operator is activated only once; once it runs, it must acknowledge the activation, which enables future activations. This pattern works well, but adds complexity for developers writing Timely operators. The change proposed above would remove that burden, at the cost of some (amortized) overhead. For specific operators, the activate-acknowledge pattern might still be useful.
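
For readers unfamiliar with the activate-acknowledge pattern, here is one way it could be sketched. This is an illustration under assumptions, not Materialize's implementation: `GatedActivator` and `Ack` are hypothetical names, and the `forward` closure stands in for whatever actually enqueues the activation (for example, a call on a Timely activator).

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical wrapper that forwards at most one activation until the
/// operator acknowledges it.
struct GatedActivator<F: FnMut()> {
    /// True while an activation is outstanding and not yet acknowledged.
    pending: Arc<AtomicBool>,
    /// Stand-in for the real activation call.
    forward: F,
}

/// Handle held by the operator to acknowledge an activation.
struct Ack {
    pending: Arc<AtomicBool>,
}

impl<F: FnMut()> GatedActivator<F> {
    fn new(forward: F) -> (Self, Ack) {
        let pending = Arc::new(AtomicBool::new(false));
        let ack = Ack { pending: Arc::clone(&pending) };
        (GatedActivator { pending, forward }, ack)
    }

    /// Forward the activation only if none is currently outstanding.
    fn activate(&mut self) {
        // `swap` returns the previous value; `false` means we are first.
        if !self.pending.swap(true, Ordering::SeqCst) {
            (self.forward)();
        }
    }
}

impl Ack {
    /// Called by the operator once it runs, re-enabling future activations.
    fn acknowledge(&self) {
        self.pending.store(false, Ordering::SeqCst);
    }
}
```

The extra handle and the need to call `acknowledge` at the right point in every operator are the developer burden mentioned above; with de-duplication in the activation buffer, repeated activations of the same operator would be collapsed without this bookkeeping.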

Activator: Bound memory utilization by compaction and de-duplication #470