This PR does two nice things: (a) it improves the test of the `Mean` aggregation scheme so that it checks what actually happens when batch sizes differ, and (b) it adds and tests a `Perplexity` scheme, which was in fact the original motivation for the aggregation schemes two years ago.
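To illustrate why the batch-size case is worth testing, here is a minimal sketch in plain Python. It is not the code from this PR, and the function names are hypothetical. It assumes the aggregated mean is meant to be the mean over all examples, and that perplexity is the exponential of the mean per-example negative log-likelihood (natural log).

```python
import math


def aggregate_mean(batches):
    """Mean over all examples, accumulated as a running sum and count."""
    total = 0.0
    count = 0
    for values in batches:
        total += sum(values)
        count += len(values)
    return total / count


def naive_mean_of_means(batches):
    """Average of per-batch means; ignores how many examples each batch has."""
    return sum(sum(v) / len(v) for v in batches) / len(batches)


def aggregate_perplexity(nll_batches):
    """exp of the mean negative log-likelihood (assumed definition)."""
    return math.exp(aggregate_mean(nll_batches))


# Hypothetical per-example values in two batches of different sizes.
batches = [[1.0, 2.0, 3.0], [10.0]]
weighted = aggregate_mean(batches)
unweighted = naive_mean_of_means(batches)
perplexity = aggregate_perplexity(batches)
```

With equal batch sizes the two means coincide, so a test that only uses equal-sized batches cannot tell them apart.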
@dmitriy-serdyuk, can you please review this?