Revise API for incentive schemes

The current API for implementing incentive schemes looks like this.

https://github.com/pkel/cpr/blob/3f8cc5eae624c7f72c4ed8c19abad9de3d00d5ca/ocaml/lib/intf.ml#L37-L39

I think it would be easier to add something like a coinbase transaction to each block. For example, the referee could define a function that returns the rewards a block pays out:

val reward: data -> (int * float) list
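A minimal sketch of what such a function could look like. The record type `data` and its fields are hypothetical (the real `data` type is protocol-specific), and I am assuming the `int` is a node id and the `float` is the reward amount.

```ocaml
(* Hypothetical block payload; the real [data] type is defined per protocol. *)
type data = { height : int; miner : int }

(* Assumption: each pair is (node id, reward amount).
   Here every block pays a constant reward of 1.0 to its miner. *)
let reward (d : data) : (int * float) list = [ (d.miner, 1.0) ]

let () =
  reward { height = 1; miner = 3 }
  |> List.iter (fun (node, amount) ->
         Printf.printf "node %d receives %g\n" node amount)
```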

Previously, the incentive scheme assigned rewards to vertices. The framework looked up the origin of the vertex and redirected the reward to the originating node.

With the new scheme this is no longer possible, which means we cannot assign rewards to votes directly. Instead, the new scheme would hand out the vote rewards with the next block.
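To illustrate handing out vote rewards with the next block, here is a rough sketch. It assumes a block records the ids of the nodes whose votes it confirms. The `voters` field and the reward amounts are made up for illustration and do not come from the existing code.

```ocaml
(* Hypothetical block payload: the miner plus the authors of the votes
   that this block confirms. *)
type data = { miner : int; voters : int list }

(* Illustrative amounts only: 1.0 for the block, 0.5 per confirmed vote.
   The votes carry no reward themselves; the block pays on their behalf. *)
let reward (d : data) : (int * float) list =
  (d.miner, 1.0) :: List.map (fun voter -> (voter, 0.5)) d.voters

let () =
  reward { miner = 0; voters = [ 1; 2 ] }
  |> List.iter (fun (node, amount) ->
         Printf.printf "node %d receives %g\n" node amount)
```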

The new scheme works better with deterministic appends, where vertices can have more than one origin.

One feature of the existing API is that one protocol can define multiple reward schemes. If we want to keep this, we could add the scheme as an argument:

val reward: scheme -> data -> (int * float) list
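A sketch of how the scheme argument might be used. The constructors `Block_only` and `Per_vote` and the `data` fields are hypothetical names chosen for this example, not the schemes the protocols currently define.

```ocaml
(* Hypothetical block payload, as in a protocol with votes. *)
type data = { miner : int; voters : int list }

(* Hypothetical reward schemes; names are illustrative only. *)
type scheme = Block_only | Per_vote

let reward (s : scheme) (d : data) : (int * float) list =
  match s with
  | Block_only -> [ (d.miner, 1.0) ]
  | Per_vote -> List.map (fun voter -> (voter, 1.0)) d.voters

(* Partial application recovers the single-scheme signature
   data -> (int * float) list. *)
let block_only : data -> (int * float) list = reward Block_only
```

Since the scheme comes first, the simulator could fix it once via partial application and treat the result like the simpler `data -> (int * float) list` function.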

The simulator could accumulate the past rewards for each DAG vertex. This would simplify the implementation of the RL engine, where we currently recalculate the rewards for the whole chain on each step. See https://github.com/pkel/cpr/blob/3f8cc5eae624c7f72c4ed8c19abad9de3d00d5ca/ocaml/gym/engine.ml#L277-L295
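The accumulation could work roughly like this: each vertex stores the totals of its parent plus its own coinbase. This is only a sketch; `accumulate` and the use of a map keyed by node id are my assumptions, not existing simulator code.

```ocaml
module IntMap = Map.Make (Int)

(* Add one block's coinbase to the totals accumulated up to its parent.
   The map is persistent, so the parent's totals stay unchanged. *)
let accumulate (parent_totals : float IntMap.t)
    (coinbase : (int * float) list) : float IntMap.t =
  List.fold_left
    (fun acc (node, amount) ->
      IntMap.update node
        (function None -> Some amount | Some x -> Some (x +. amount))
        acc)
    parent_totals coinbase

let () =
  let genesis = IntMap.empty in
  let b1 = accumulate genesis [ (0, 1.0) ] in
  let b2 = accumulate b1 [ (1, 1.0); (0, 0.5) ] in
  IntMap.iter (Printf.printf "node %d: %g\n") b2
```

With totals stored per vertex, the RL engine would only need to read the totals at the current head of the chain instead of walking the whole chain on each step.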

Revise API for incentive schemes #14