Support multiple learning modes

I've been thinking a bit more about these learning modes, and there are actually three distinct things going on: a learning mode, which specifies whether or not we merge when given a 1 (boundary) label; a labeling mode, which specifies how to get the label; and a sampling mode, which specifies how to obtain the merge order. For a while I thought that only some combinations of these made sense, but actually all combinations can work conceptually.

Here are the options so far for each of those:

- learning modes: strict (my approach) or laissez-faire (Viren's, and RL in general)
- labeling modes: assignment (mine), Rand change sign (Viren's), VOI sign (new)
- sampling modes: random (mine), active (Viren's), boundary mean (mine when I had a bug, but it still worked well!)

[I don't think you were in this discussion, but I found out that all the results on my poster came from training through mean boundary, not random!]
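To make the "all combinations" point concrete, here is a rough sketch of how the three modes could be kept as independent settings. The enum names and string values below are hypothetical, chosen only for illustration; they are not existing identifiers in ray.

```python
from enum import Enum
from itertools import product


# Hypothetical names: none of these enums exist in ray as written.
class LearningMode(Enum):
    STRICT = "strict"
    LAISSEZ_FAIRE = "laissez-faire"


class LabelingMode(Enum):
    ASSIGNMENT = "assignment"
    RAND_SIGN = "rand-sign"
    VOI_SIGN = "voi-sign"


class SamplingMode(Enum):
    RANDOM = "random"
    ACTIVE = "active"
    BOUNDARY_MEAN = "boundary-mean"


# Every (learning, labeling, sampling) triple is a candidate configuration.
all_combinations = list(product(LearningMode, LabelingMode, SamplingMode))
print(len(all_combinations))  # 2 * 3 * 3 = 18
```

Keeping the three settings as separate arguments, rather than one combined "mode" string, would let any of the 18 combinations be selected without special-casing.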

One could also consider a mixed sampling approach, for example, alternating epochs of random and active sampling.
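A mixed schedule like that could be as simple as cycling through a list of sampling modes by epoch. This is only a sketch: `sampling_schedule` and the mode strings are made up for illustration, and the actual training loop would still need to dispatch on the returned mode.

```python
def sampling_schedule(num_epochs, modes=("random", "active")):
    """Return the sampling mode for each epoch, cycling through `modes`.

    Hypothetical helper, not part of ray.
    """
    return [modes[epoch % len(modes)] for epoch in range(num_epochs)]


# Alternate random and active sampling over four epochs.
schedule = sampling_schedule(4)
print(schedule)  # ['random', 'active', 'random', 'active']
```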

jni / ray

Support multiple learning modes #45