cleartk-eval is too complicated

GoogleCodeExporter commented 9 years ago

The cleartk-eval project is still too complicated. It forces you to split into 
a large number of classes and methods what is conceptually only three things:

(1) How you create CollectionReaders for your corpus
(2) How you train a model given a CollectionReader
(3) How you test a model given a CollectionReader

The cleartk-eval project also currently makes it quite difficult to do things 
like grid-search over your parameters, because the testing methods like 
runCrossValidation don't return anything.

I propose that instead, we provide an API something like:

public abstract class Evaluation_ImplBase<ITEM_TYPE, STATS_TYPE> {
  protected abstract CollectionReader getCollectionReader(List<ITEM_TYPE> items);
  protected abstract void train(CollectionReader collectionReader, File directory);
  protected abstract STATS_TYPE test(CollectionReader collectionReader, File directory);

  public Evaluation_ImplBase(File baseDirectory) { ... }
  public STATS_TYPE trainAndTest(List<ITEM_TYPE> trainItems, List<ITEM_TYPE> testItems) { ... }
  public List<STATS_TYPE> crossValidation(List<ITEM_TYPE> items, int folds) { ... }
}

As a user, you just implement getCollectionReader, train and test. The 
Evaluation_ImplBase class uses these to provide trainAndTest, crossValidation, 
etc. Reasons I think this structure is much better:

* This API puts together things that are already tightly coupled. Writing 
training data and training the classifier are a single method, train, and 
making classifier predictions and scoring them are another method, test. The 
current API splits all these things into different methods in different 
classes, but in my experience they're so tightly coupled that it's impossible 
to write one without looking back and forth to the other.

* This API makes it easy to do things like grid-search because you have direct 
access to the evaluation statistics. The API doesn't assume your evaluation is 
written in the form of an AnalysisEngine, so you can just write the evaluation 
code directly in the method and return the result, rather than writing 
everything to disk (though you can still do that if you want).

* This API would also address Issue 260 (multipass pipelines) and Issue 230 
(additional preprocessing), because you have much more flexibility in how you 
train and test. No extra methods needed!

What do you guys think?

Original issue reported on code.google.com by steven.b...@gmail.com on 1 May 2012 at 10:57

GoogleCodeExporter commented 9 years ago

So in this approach, you would be responsible for making calls to 
SimplePipeline.run() in your train and test methods?  I actually much prefer 
this over the pipeline providers as I often get lost as to what the pipeline 
looks like.    It would be really nice if whatever pipeline I'm running in the 
eval could also be shared with a standalone training main class.  That isn't 
possible with the existing eval, do you think it would be with this?

What would a STATS_TYPE class look like?  Is it pretty much any object you want 
to stuff results into?  Could it work with the existing fcollections?

At first pass I like this idea.  Perhaps we should test out any prototypes on 
the tf*idf document classification example.

Original comment by lee.becker on 1 May 2012 at 4:42

GoogleCodeExporter commented 9 years ago

Yep, you call the SimplePipeline.run() yourself. It should make it much easier 
to see what the pipeline really looks like.

I'm working on translating the cleartk-timeml evaluations to this paradigm. So 
far, it looks good, but I'll report back when I've finished.

Original comment by steven.b...@gmail.com on 1 May 2012 at 4:59

GoogleCodeExporter commented 9 years ago

Oh, and the STATS_TYPE could be anything you want, but on the simplest version, 
it would probably look like the EvaluationStatistics we have in the relation 
extraction code.

Original comment by steven.b...@gmail.com on 1 May 2012 at 5:01

GoogleCodeExporter commented 9 years ago

In r3897, I committed a version of this, and in r3898 applied it to 
cleartk-timeml. I've also applied it to the SHARPn relation-extractor in 
cTAKES, which is really a great illustration of how much cleaner this API is.

Before:

http://ohnlp.svn.sourceforge.net/viewvc/ohnlp/branches/SHARPn-cTAKES/relation-ex
tractor/src/org/chboston/cnlp/ctakes/relationextractor/eval/RelationExtractorEva
luation.java?revision=798

After:

http://ohnlp.svn.sourceforge.net/viewvc/ohnlp/branches/SHARPn-cTAKES/relation-ex
tractor/src/org/chboston/cnlp/ctakes/relationextractor/eval/RelationExtractorEva
luation.java?revision=838

Note that in the after, I'm not showing all the other whole classes we got to 
delete along with the edits in RelationExtractorEvaluation. The new APIs let us 
get rid of CorpusReaderProvider_ImplBase, XMICorpusReaderProvider, 
DegreeOfRelationExtractorPipelineProvider, 
EntityMentionPairRelationExtractorPipelineProvider and 
RelationExtractionPipelineProvider.

For me, the key difference is that it's now very clear what the training 
procedure looks like and what the testing procedure looks like. In the old 
version, the training procedure was partly hidden in the cleartk-eval code and 
partly in the the pipeline provider defined, while the testing procedure was 
partly in the classification pipeline and partly in the evaluation pipeline.

Also note how easy it is with the new API to support grid search - no need to 
save and load things from odd places on the file system.

I would like to deprecate all the old cleartk-eval APIs. What do you guys think?

Original comment by steven.b...@gmail.com on 2 May 2012 at 6:51

Changed state: Started

GoogleCodeExporter commented 9 years ago

This issue was closed by revision r3901.

Original comment by steven.b...@gmail.com on 3 May 2012 at 9:35

Changed state: Fixed

GoogleCodeExporter commented 9 years ago

I went ahead and deprecated the old APIs. If anyone really thinks that was a 
bad idea, we can back this change out.

Original comment by steven.b...@gmail.com on 3 May 2012 at 9:36

GoogleCodeExporter commented 9 years ago

Original comment by steven.b...@gmail.com on 5 Aug 2012 at 8:49

Added labels: Milestone-1.2, Component-eval

noushadali / cleartk

cleartk-eval is too complicated #304