ashokpant / dkpro-tc

Automatically exported from code.google.com/p/dkpro-tc

Experiments should fail with a meaningful exception if no data are found #139

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. Run the TwentyNewsgroupsDemo from a working directory other than the 
correct one (examples-gpl).
2. It fails with the following exception, because the documents could not be 
located:

Exception in thread "main" de.tudarmstadt.ukp.dkpro.lab.engine.ExecutionException: de.tudarmstadt.ukp.dkpro.lab.engine.ExecutionException: java.lang.IllegalStateException: Requested [3] folds, but only got [0] values. There must be at least as many values as folds.
    at de.tudarmstadt.ukp.dkpro.lab.engine.impl.ExecutableTaskEngine.run(ExecutableTaskEngine.java:68)
    at de.tudarmstadt.ukp.dkpro.lab.engine.impl.DefaultTaskExecutionService.run(DefaultTaskExecutionService.java:48)
    at de.tudarmstadt.ukp.dkpro.lab.Lab.run(Lab.java:97)
    at de.tudarmstadt.ukp.dkpro.tc.examples.single.document.TwentyNewsgroupsDemo.runCrossValidation(TwentyNewsgroupsDemo.java:135)
    at de.tudarmstadt.ukp.dkpro.tc.examples.single.document.TwentyNewsgroupsDemo.main(TwentyNewsgroupsDemo.java:61)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Caused by: de.tudarmstadt.ukp.dkpro.lab.engine.ExecutionException: java.lang.IllegalStateException: Requested [3] folds, but only got [0] values. There must be at least as many values as folds.
    at de.tudarmstadt.ukp.dkpro.lab.engine.impl.ExecutableTaskEngine.run(ExecutableTaskEngine.java:68)
    at de.tudarmstadt.ukp.dkpro.lab.task.impl.BatchTask.execute(BatchTask.java:207)
    at de.tudarmstadt.ukp.dkpro.tc.weka.task.BatchTaskCrossValidation.execute(BatchTaskCrossValidation.java:212)
    at de.tudarmstadt.ukp.dkpro.lab.engine.impl.ExecutableTaskEngine.run(ExecutableTaskEngine.java:55)
    ... 9 more
Caused by: java.lang.IllegalStateException: Requested [3] folds, but only got [0] values. There must be at least as many values as folds.
    at de.tudarmstadt.ukp.dkpro.lab.task.impl.FoldDimensionBundle.init(FoldDimensionBundle.java:116)
    at de.tudarmstadt.ukp.dkpro.lab.task.impl.FoldDimensionBundle.rewind(FoldDimensionBundle.java:147)
    at de.tudarmstadt.ukp.dkpro.lab.task.ParameterSpace$ParameterSpaceIterator.step(ParameterSpace.java:151)
    at de.tudarmstadt.ukp.dkpro.lab.task.ParameterSpace$ParameterSpaceIterator.<init>(ParameterSpace.java:142)
    at de.tudarmstadt.ukp.dkpro.lab.task.ParameterSpace.iterator(ParameterSpace.java:125)
    at de.tudarmstadt.ukp.dkpro.lab.task.impl.BatchTask.execute(BatchTask.java:146)
    at de.tudarmstadt.ukp.dkpro.tc.weka.task.BatchTaskCrossValidation$1.execute(BatchTaskCrossValidation.java:143)
    at de.tudarmstadt.ukp.dkpro.lab.engine.impl.ExecutableTaskEngine.run(ExecutableTaskEngine.java:55)
    ... 12 more

What is the expected output? What do you see instead?

- It would be helpful to see a meaningful exception, similar to the one from 
TwentyNewsgroupsRaw:

Information: Found [0] resources to be read
Exception in thread "main" org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at de.tudarmstadt.ukp.dkpro.tc.core.task.uima.ExtractFeaturesConnector.collectionProcessComplete(ExtractFeaturesConnector.java:139)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.collectionProcessComplete(PrimitiveAnalysisEngine_impl.java:331)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.collectionProcessComplete(AggregateAnalysisEngine_impl.java:336)
    at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:88)
    at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:115)
    at de.tudarmstadt.ukp.dkpro.tc.examples.raw.TwentyNewsgroupsRaw.main(TwentyNewsgroupsRaw.java:34)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Caused by: java.lang.IllegalArgumentException: List of instance outcomes is empty.
    at de.tudarmstadt.ukp.dkpro.tc.weka.util.WekaUtils.instanceListToArffFile(WekaUtils.java:223)
    at de.tudarmstadt.ukp.dkpro.tc.weka.writer.WekaDataWriter.write(WekaDataWriter.java:26)
    at de.tudarmstadt.ukp.dkpro.tc.core.task.uima.ExtractFeaturesConnector.collectionProcessComplete(ExtractFeaturesConnector.java:136)
    ... 10 more

or something even better (e.g. "no training/test instances found/loaded").

What version of the product are you using? On what operating system?

- trunk

Please provide any additional information below.

- reaction to issue #138

Original issue reported on code.google.com by ivan.hab...@gmail.com on 6 Jun 2014 at 1:12

GoogleCodeExporter commented 9 years ago
I suggest that both examples throw their exception in a different place than 
they do now. The Demo example above throws an exception when splitting the 
instances into folds for CV. The Raw example throws an exception when writing 
ARFF files for Weka, when it notices that the list of instance outcomes is 
empty. Both exceptions are distant downstream effects of a Reader that found 
0 resources.

Perhaps we can throw an Exception somewhere in the Reader inheritance family.  
I will query the DKPro Core mailing list about the proper place to do this, TC 
or Core, and which class.
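
To illustrate the fail-fast idea, here is a minimal sketch in plain Java. It does not use the actual Reader classes; InputGuard and requireResources are hypothetical names, and the assumption is that the input is a directory of files on the local file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of DKPro TC or DKPro Core.
final class InputGuard {

    private InputGuard() {
        // static helper only
    }

    // Collects all regular files below sourceDir and fails immediately,
    // with a message naming the location, if there are none.
    static List<Path> requireResources(Path sourceDir) throws IOException {
        if (!Files.isDirectory(sourceDir)) {
            throw new IllegalStateException("Source location ["
                    + sourceDir.toAbsolutePath()
                    + "] is not a directory. Check the working directory.");
        }
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            List<Path> files = walk.filter(Files::isRegularFile)
                    .collect(Collectors.toList());
            if (files.isEmpty()) {
                throw new IllegalStateException("Found [0] resources in ["
                        + sourceDir.toAbsolutePath()
                        + "]. No training/test instances can be loaded.");
            }
            return files;
        }
    }
}
```

With a check like this, the error would name the missing input location instead of surfacing later as a fold-count or empty-outcome problem.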

Original comment by EmilyKJa...@gmail.com on 17 Jun 2014 at 3:00

GoogleCodeExporter commented 9 years ago
IMHO it is perfectly acceptable that a reader produces 0 results. Why can TC 
not just gracefully handle that? If there is no input data, then there should 
be no results - not an exception - my 10 cents.

Original comment by richard.eckart on 17 Jun 2014 at 3:06

GoogleCodeExporter commented 9 years ago
Said otherwise: in my opinion, experiments shouldn't fail at all if no data was 
found.

Original comment by richard.eckart on 17 Jun 2014 at 3:07

GoogleCodeExporter commented 9 years ago
TC should definitely handle that and, as was already said, terminate the 
experiment gracefully with a meaningful error message.

Original comment by torsten....@gmail.com on 17 Jun 2014 at 3:13

GoogleCodeExporter commented 9 years ago
Possibly the pre-process task could check at its end if any data has been 
written to its output location.
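
A minimal sketch of such an end-of-task check, assuming the output location is a plain directory on disk. OutputCheck, hasOutput and requireOutput are hypothetical names and not part of the TC or Lab API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of DKPro TC or DKPro Lab.
final class OutputCheck {

    private OutputCheck() {
        // static helper only
    }

    // True if outputDir exists and contains at least one entry.
    static boolean hasOutput(Path outputDir) throws IOException {
        if (!Files.isDirectory(outputDir)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(outputDir)) {
            return entries.iterator().hasNext();
        }
    }

    // Intended to be called once a task has finished writing.
    static void requireOutput(String taskName, Path outputDir) throws IOException {
        if (!hasOutput(outputDir)) {
            throw new IllegalStateException("Task [" + taskName
                    + "] wrote no data to [" + outputDir.toAbsolutePath()
                    + "]. No training/test instances were found;"
                    + " check the input location.");
        }
    }
}
```

Because the check only lists the directory, it costs next to nothing when preprocessing has produced no data.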

Original comment by richard.eckart on 17 Jun 2014 at 3:30

GoogleCodeExporter commented 9 years ago
There seems to be no collectionProcessComplete()-type method anywhere in the 
Reader inheritance hierarchy. I think we would need to throw the exception 
somewhere else in TC rather than in the Reader. But there's got to be a better 
place, before all the preprocessing has been attempted, right?

-Emily

Original comment by EmilyKJa...@gmail.com on 17 Jun 2014 at 4:38

GoogleCodeExporter commented 9 years ago
Well, the preprocessing won't take too long, if there is no data... So, why not 
check at the end?

Original comment by daxenber...@gmail.com on 17 Jun 2014 at 4:46

GoogleCodeExporter commented 9 years ago

Original comment by daxenber...@gmail.com on 4 Jul 2014 at 9:37

GoogleCodeExporter commented 9 years ago

Original comment by torsten....@gmail.com on 11 Sep 2014 at 7:33

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r1101.

Original comment by torsten....@gmail.com on 12 Sep 2014 at 1:55

GoogleCodeExporter commented 9 years ago

Original comment by daxenber...@gmail.com on 1 Apr 2015 at 5:10