Trusted-AI / AIF360

A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
https://aif360.res.ibm.com/
Apache License 2.0

Extend average_* metrics to use different averages #413

Open hoffmansc opened 2 years ago

Adds a new average() helper function which computes a generalized average. Options include: minimum, maximum, harmonic, geometric, arithmetic, and root mean square. It is used in conjunction with recall_score and precision_score (with average=None) to average the per-class TPR or PPV, respectively. This means it also generalizes to multiclass classification.
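
For illustration, here is a minimal sketch (not the code added by this PR) of what such a generalized average() helper could look like and how it might combine with scikit-learn's per-class scores; the `kind` argument name and the example data are assumptions:

```python
import numpy as np
from sklearn.metrics import recall_score, precision_score

def average(arr, kind='arithmetic'):
    """Generalized average over an array of per-class rates (hypothetical sketch)."""
    arr = np.asarray(arr, dtype=float)
    if kind == 'minimum':
        return arr.min()
    if kind == 'maximum':
        return arr.max()
    if kind == 'harmonic':
        return len(arr) / np.sum(1.0 / arr)
    if kind == 'geometric':
        return np.exp(np.mean(np.log(arr)))
    if kind == 'arithmetic':
        return arr.mean()
    if kind == 'rms':  # root mean square
        return np.sqrt(np.mean(arr ** 2))
    raise ValueError(f"unknown average: {kind}")

# Per-class TPR (recall) and PPV (precision); average=None returns one value
# per target class, so the same helper works for binary and multiclass problems.
y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 0, 2]
tpr_per_class = recall_score(y_true, y_pred, average=None)
ppv_per_class = precision_score(y_true, y_pred, average=None, zero_division=0)
print(average(tpr_per_class, kind='harmonic'))
print(average(ppv_per_class, kind='rms'))
```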

In the binary case, average_odds_error should match the original definition exactly. average_odds_difference, however, will compute average([TPR_diff, TNR_diff]) instead of average([TPR_diff, FPR_diff]) as before. This may break some things. On the other hand, I think it's cleaner since the sign of each component always denotes privilege -- i.e., a negative TNR_diff means the privileged group is better off, whereas previously that was indicated by a positive FPR_diff.
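
As a quick worked example (my own numbers, assuming diff = unprivileged rate minus privileged rate and an arithmetic average):

```python
# Sign-convention change between the old and new average_odds_difference.
tpr_priv, fpr_priv = 0.9, 0.2
tpr_unpriv, fpr_unpriv = 0.7, 0.3

tpr_diff = tpr_unpriv - tpr_priv              # -0.2
fpr_diff = fpr_unpriv - fpr_priv              # +0.1 (higher FPR hurts the unprivileged group)
tnr_diff = (1 - fpr_unpriv) - (1 - fpr_priv)  # -0.1 == -fpr_diff

old = (tpr_diff + fpr_diff) / 2  # -0.05; mixed signs even though the privileged group is better off on both rates
new = (tpr_diff + tnr_diff) / 2  # -0.15; both components negative => unprivileged group worse off on both
```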

See #88. Also closes #377.

TODO