mkabbasi / cleartk

Automatically exported from code.google.com/p/cleartk

Reconsider Classifier.score interface #373

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Currently Classifier.score looks like:

  public List<ScoredOutcome<OUTCOME_TYPE>> score(List<Feature> features, int maxResults)

This API conflates a number of things:

 (1) Getting scores for each outcome
 (2) Sorting the outcomes by score
 (3) Limiting the number of outcomes

For many use cases, both steps (2) and (3) are unnecessary. I think we should 
consider a much simpler API:

  public Map<OUTCOME_TYPE, Double> score(List<Feature> features)

Anyone who really wanted (2) and (3) could get them from a `scoredOutcomes` map 
with:

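    // reverse() puts the highest score first; natural() alone sorts ascending.
    Ordering<OUTCOME_TYPE> ordering = Ordering.<Double>natural().reverse().onResultOf(Functions.forMap(scoredOutcomes));
    List<OUTCOME_TYPE> sorted = ordering.sortedCopy(scoredOutcomes.keySet());
    List<OUTCOME_TYPE> outcomes = sorted.subList(0, Math.min(maxResults, sorted.size()));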
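
For anyone who would rather not depend on Guava, the same sort-and-limit step can be sketched with the JDK alone. This is only an illustration: the class name `TopOutcomes` and the method `topOutcomes` are made up for this example and are not part of ClearTK, and the sketch assumes Java 8 or later and a non-negative `maxResults`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of ClearTK.
final class TopOutcomes {

  /** Returns up to maxResults outcomes, ordered from highest to lowest score. */
  static <T> List<T> topOutcomes(Map<T, Double> scoredOutcomes, int maxResults) {
    // Copy the entries so the caller's map is left untouched.
    List<Map.Entry<T, Double>> entries = new ArrayList<>(scoredOutcomes.entrySet());

    // Step (2): sort by score, highest first.
    entries.sort(Map.Entry.<T, Double>comparingByValue().reversed());

    // Step (3): keep at most maxResults outcomes, even if the map is smaller.
    int limit = Math.min(maxResults, entries.size());
    List<T> outcomes = new ArrayList<>(limit);
    for (Map.Entry<T, Double> entry : entries.subList(0, limit)) {
      outcomes.add(entry.getKey());
    }
    return outcomes;
  }
}
```

A caller would pass the map returned by the proposed `score(features)` together with the old `maxResults` value. Callers that only need the scores skip this step entirely, which is the point of the simpler API.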

Note that the same issues apply to SequenceClassifier.

Original issue reported on code.google.com by steven.b...@gmail.com on 30 May 2013 at 10:09

GoogleCodeExporter commented 8 years ago
We'll add the new score method and deprecate the old one. This will break 
existing Classifier implementations, which will have to add the new method, 
but code that only calls Classifiers should be okay.

Original comment by steven.b...@gmail.com on 25 Jun 2013 at 6:05

GoogleCodeExporter commented 8 years ago
This issue was closed by revision 3f9aa70f713a.

Original comment by steven.b...@gmail.com on 24 Jul 2013 at 1:51

GoogleCodeExporter commented 8 years ago
We decided not to deprecate the old method and simply changed its signature 
instead. This is ClearTK 2.0, after all, so a breaking change is acceptable.

Note that this change removed the ScoredOutcome class. There is no replacement 
class; use the Map<OUTCOME_TYPE, Double> returned by score instead.
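
As a rough illustration of working with the map directly, code that previously took the first element of the sorted ScoredOutcome list could pick the highest-scoring entry instead. The class name `BestOutcome` and the method `best` are hypothetical and not part of ClearTK; the sketch assumes Java 8 or later and a non-empty map.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not part of ClearTK.
final class BestOutcome {

  /**
   * Returns the outcome with the highest score.
   * Throws NoSuchElementException if the map is empty.
   */
  static <T> T best(Map<T, Double> scoredOutcomes) {
    // Compare entries by their score and return the key of the largest one.
    return Collections.max(
        scoredOutcomes.entrySet(),
        Map.Entry.<T, Double>comparingByValue()).getKey();
  }
}
```

The score of any particular outcome is then a plain `scoredOutcomes.get(outcome)` lookup, so no wrapper object is needed.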

Original comment by steven.b...@gmail.com on 24 Jul 2013 at 1:53