stripe-archive / brushfire

Distributed decision tree ensemble learning in Scala

Add a CrossEntropyError class. #93

Open erik-stripe opened 8 years ago

erik-stripe commented 8 years ago

This PR gives us another way to evaluate how well predictions match the actual known distribution. The iris example has been ported to demonstrate this method in practice.
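
For readers unfamiliar with the measure, here is a minimal sketch of cross entropy between an actual distribution and a predicted one. It is not the `CrossEntropyError` code from this PR; the object name, the method name, and the `Map[String, Double]` representation are assumptions made for illustration only.

```scala
object CrossEntropySketch {
  // H(actual, predicted) = -sum over labels of actual(x) * log(predicted(x)).
  // Labels with zero actual probability contribute nothing, so they are skipped.
  // A label that actually occurs but is predicted with probability 0 yields
  // positive infinity, because log(0) is negative infinity.
  def crossEntropy(actual: Map[String, Double], predicted: Map[String, Double]): Double =
    actual.iterator.collect {
      case (label, p) if p > 0.0 => -p * math.log(predicted.getOrElse(label, 0.0))
    }.sum

  def main(args: Array[String]): Unit = {
    val actual = Map("setosa" -> 1.0)
    val uniform = Map("setosa" -> 0.5, "versicolor" -> 0.5)
    // A confident, correct prediction scores lower (better) than a uniform guess.
    println(crossEntropy(actual, Map("setosa" -> 1.0)))
    println(crossEntropy(actual, uniform))
  }
}
```

Lower values mean the predicted distribution is closer to the actual one; the natural logarithm is used here, so the result is in nats.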

There is also a small refactoring of the local trainer's validate method, and some small refactors of other error classes.

erik-stripe commented 8 years ago

Review by @tixxit, @avibryant, and/or @johnynek.

erik-stripe commented 8 years ago

There are some problems here -- please wait to merge until I fix them (Travis should notice them too).

avibryant commented 8 years ago

Having other people extend Brushfire really makes me feel the lack of having written tests. This makes me feel bad but is good for improving the code.

Here's a law that I think should hold for all Error instances, and that I think this PR currently fails:

error.semigroup.plus(error.create(a1, p), error.create(a2, p)) ==
  error.create(Semigroup.plus(a1, a2), p)

Please note, this does not hold for predictions, that is:

error.semigroup.plus(error.create(a, p1), error.create(a, p2)) !=
  error.create(a, Semigroup.plus(p1, p2))
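
A rough sketch of how the law above could be checked in a test. Everything here is an assumption for illustration: the `Semigroup` trait is a dependency-free stand-in for Algebird's, the `Error` trait only mirrors the `semigroup` and `create` members used in the law, and `misclassified` is a hypothetical error (a count of actual instances whose label is not the top predicted label), not one of Brushfire's classes.

```scala
// Minimal stand-in for Algebird's Semigroup, so the sketch has no dependencies.
trait Semigroup[A] { def plus(l: A, r: A): A }

// Assumed shape of the error type class: T = actuals, P = prediction, E = error value.
trait Error[T, P, E] {
  def semigroup: Semigroup[E]
  def create(actual: T, predicted: P): E
}

object ErrorLawSketch {
  type Counts = Map[String, Long]   // actual label counts
  type Probs = Map[String, Double]  // predicted label scores

  // Key-wise sum of two maps, given how to add the values.
  private def mergeWith[V](l: Map[String, V], r: Map[String, V])(add: (V, V) => V): Map[String, V] =
    r.foldLeft(l) { case (acc, (k, v)) => acc.updated(k, acc.get(k).fold(v)(add(_, v))) }

  val longSemigroup: Semigroup[Long] = new Semigroup[Long] {
    def plus(l: Long, r: Long): Long = l + r
  }
  val countsSemigroup: Semigroup[Counts] = new Semigroup[Counts] {
    def plus(l: Counts, r: Counts): Counts = mergeWith(l, r)(_ + _)
  }
  val probsSemigroup: Semigroup[Probs] = new Semigroup[Probs] {
    def plus(l: Probs, r: Probs): Probs = mergeWith(l, r)(_ + _)
  }

  // Hypothetical error: how many actual instances are not the top predicted label.
  val misclassified: Error[Counts, Probs, Long] = new Error[Counts, Probs, Long] {
    val semigroup: Semigroup[Long] = longSemigroup
    def create(actual: Counts, predicted: Probs): Long = {
      val best = predicted.maxBy(_._2)._1
      actual.iterator.collect { case (label, n) if label != best => n }.sum
    }
  }

  // The law: combining errors equals the error of the combined actuals.
  def holdsForActuals[T, P, E](error: Error[T, P, E], ts: Semigroup[T])(a1: T, a2: T, p: P): Boolean =
    error.semigroup.plus(error.create(a1, p), error.create(a2, p)) ==
      error.create(ts.plus(a1, a2), p)

  // The analogous statement for predictions, which is not expected to hold.
  def holdsForPredictions[T, P, E](error: Error[T, P, E], ps: Semigroup[P])(a: T, p1: P, p2: P): Boolean =
    error.semigroup.plus(error.create(a, p1), error.create(a, p2)) ==
      error.create(a, ps.plus(p1, p2))

  def main(args: Array[String]): Unit = {
    val a1: Counts = Map("x" -> 3L, "y" -> 1L)
    val a2: Counts = Map("x" -> 2L, "y" -> 5L)
    val p1: Probs = Map("x" -> 0.9, "y" -> 0.1)
    val p2: Probs = Map("x" -> 0.2, "y" -> 0.8)
    assert(holdsForActuals(misclassified, countsSemigroup)(a1, a2, p1))
    assert(!holdsForPredictions(misclassified, probsSemigroup)(a1, p1, p2))
  }
}
```

The two fixed inputs only illustrate the idea; a property-based test that generates many `a1`, `a2`, and `p` values would be a stronger check. An error whose value is a `Double` would probably need an approximate comparison rather than `==`.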

CLAassistant commented 4 years ago

CLA assistant check
All committers have signed the CLA.