mathieumb closed 7 years ago
@jeremybarnes @mailletf
@jeremybarnes
@jeremybarnes ping
Lots of small-ish comments, and a couple of more serious issues. Also, be careful about copying vectors: the accuracy code is part of the basic workflow, and it is a productivity sink if it's slow.
In general the approach looks good; it just needs some whipping into shape. I would also consider splitting this into a few smaller PRs: first do the refactoring, second build the basic machinery, and third modify MLDB to incorporate it. That way it would be much clearer in terms of impact and intent.
Verbal +1 from @jeremybarnes
The goal is to handle classification problems where each example has a set of labels instead of a single one (for example, tagging content).
There are 3 options: two trivial ones for comparison purposes, and one-vs-all, which trains a probabilized binary classifier for each possible label.
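To make the one-vs-all idea concrete, here is a minimal Python sketch. It is not the code in this PR and not MLDB's API: `train_logistic`, `train_one_vs_all` and `predict_labels` are hypothetical names, and a tiny hand-rolled logistic regression stands in for whatever probabilized binary classifier is actually used. It only shows how a multi-label dataset is split into one binary problem per label.

```python
import math
from typing import Callable, Dict, List, Sequence, Set

Vector = Sequence[float]
BinaryModel = Callable[[Vector], float]  # returns an estimate of P(label | x)


def _sigmoid(z: float) -> float:
    # Clamp z so math.exp cannot overflow on extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs: List[Vector], ys: List[int],
                   epochs: int = 500, lr: float = 0.5) -> BinaryModel:
    """Hypothetical stand-in binary trainer: logistic regression via SGD."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = _sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            g = p - y  # gradient of the log loss w.r.t. the score
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g

    def predict(x: Vector) -> float:
        return _sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

    return predict


def train_one_vs_all(xs: List[Vector], label_sets: List[Set[str]],
                     train_binary=train_logistic) -> Dict[str, BinaryModel]:
    """Train one binary model per label: 1 if the label is in the
    example's label set, 0 otherwise."""
    all_labels = sorted(set().union(*label_sets))
    models = {}
    for label in all_labels:
        ys = [1 if label in labels else 0 for labels in label_sets]
        models[label] = train_binary(xs, ys)
    return models


def predict_labels(models: Dict[str, BinaryModel], x: Vector,
                   threshold: float = 0.5) -> Set[str]:
    """Return every label whose model scores at or above the threshold."""
    return {label for label, model in models.items() if model(x) >= threshold}


if __name__ == "__main__":
    # Toy data (made up for illustration): feature i signals label i.
    xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    label_sets = [{"sports"}, {"politics"}, {"sports", "politics"}, set()]
    models = train_one_vs_all(xs, label_sets)
    print(predict_labels(models, [1.0, 1.0]))
```

Because each label gets its own independent model, an example can end up with zero, one, or several labels, which is the property a single multi-class classifier lacks. The 0.5 threshold is an arbitrary choice for the sketch.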