inception-project / external-recommender-dkpro-tc

External recommender for DKPro TC
Apache License 2.0

#2 - Update request format to newest version #3

Closed jcklie closed 5 years ago

jcklie commented 5 years ago

If I run the unit tests, then I get for the RoundTripTest:

Nov 27, 2018 3:24:27 PM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(434)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.dkpro.tc.core.task.uima.InstanceExtractor.getInstances(InstanceExtractor.java:83)
    at org.dkpro.tc.core.task.uima.ExtractFeaturesConnector.process(ExtractFeaturesConnector.java:168)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:401)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:318)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
    at org.dkpro.lab.uima.engine.simple.SimpleExecutionEngine.run(SimpleExecutionEngine.java:138)
    at org.dkpro.lab.engine.impl.BatchTaskEngine.runNewExecution(BatchTaskEngine.java:346)
    at org.dkpro.lab.engine.impl.BatchTaskEngine.executeConfiguration(BatchTaskEngine.java:241)
    at org.dkpro.lab.engine.impl.BatchTaskEngine.run(BatchTaskEngine.java:133)
    at org.dkpro.lab.engine.impl.DefaultTaskExecutionService.run(DefaultTaskExecutionService.java:52)
    at org.dkpro.lab.Lab.run(Lab.java:113)
    at org.dkpro.tc.ml.builder.ExperimentBuilder.run(ExperimentBuilder.java:675)
    at de.unidue.ltl.recommender.core.train.TrainNewModel.startTraining(TrainNewModel.java:98)
    at de.unidue.ltl.recommender.core.train.TrainNewModel.run(TrainNewModel.java:55)
    at de.unidue.ltl.recommender.core.train.RoundTripTest.train(RoundTripTest.java:133)
    at de.unidue.ltl.recommender.core.train.RoundTripTest.roundTrip(RoundTripTest.java:75)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 64945
    at org.apache.uima.cas.impl.CASImpl.getFeatureValue(CASImpl.java:2444)
    at org.apache.uima.cas.impl.CASImpl.getSofaFeat(CASImpl.java:887)
    at org.apache.uima.cas.impl.CASImpl.ll_getSofaCasView(CASImpl.java:5206)
    at org.apache.uima.jcas.cas.AnnotationBase.getView(AnnotationBase.java:85)
    at org.apache.uima.jcas.tcas.Annotation.getCoveredText(Annotation.java:123)
    at org.dkpro.tc.features.tcu.TargetSurfaceFormContextFeature.getTargetText(TargetSurfaceFormContextFeature.java:109)
    at org.dkpro.tc.features.tcu.TargetSurfaceFormContextFeature.extract(TargetSurfaceFormContextFeature.java:69)
    at org.dkpro.tc.core.task.uima.InstanceExtractor.getSparse(InstanceExtractor.java:374)
    at org.dkpro.tc.core.task.uima.InstanceExtractor.getSequenceInstances(InstanceExtractor.java:130)
    at org.dkpro.tc.core.task.uima.InstanceExtractor.getInstances(InstanceExtractor.java:69)
    ... 43 more

org.dkpro.lab.engine.ExecutionException: org.dkpro.lab.engine.ExecutionException: org.apache.uima.analysis_engine.AnalysisEngineProcessException

    at org.dkpro.lab.engine.impl.BatchTaskEngine.run(BatchTaskEngine.java:155)
    at org.dkpro.lab.engine.impl.DefaultTaskExecutionService.run(DefaultTaskExecutionService.java:52)
    at org.dkpro.lab.Lab.run(Lab.java:113)
    at org.dkpro.tc.ml.builder.ExperimentBuilder.run(ExperimentBuilder.java:675)
    at de.unidue.ltl.recommender.core.train.TrainNewModel.startTraining(TrainNewModel.java:98)
    at de.unidue.ltl.recommender.core.train.TrainNewModel.run(TrainNewModel.java:55)
    at de.unidue.ltl.recommender.core.train.RoundTripTest.train(RoundTripTest.java:133)
    at de.unidue.ltl.recommender.core.train.RoundTripTest.roundTrip(RoundTripTest.java:75)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: org.dkpro.lab.engine.ExecutionException: org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.dkpro.lab.uima.engine.simple.SimpleExecutionEngine.run(SimpleExecutionEngine.java:175)
    at org.dkpro.lab.engine.impl.BatchTaskEngine.runNewExecution(BatchTaskEngine.java:346)
    at org.dkpro.lab.engine.impl.BatchTaskEngine.executeConfiguration(BatchTaskEngine.java:241)
    at org.dkpro.lab.engine.impl.BatchTaskEngine.run(BatchTaskEngine.java:133)
    ... 34 more
Caused by: org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.dkpro.tc.core.task.uima.InstanceExtractor.getInstances(InstanceExtractor.java:83)
    at org.dkpro.tc.core.task.uima.ExtractFeaturesConnector.process(ExtractFeaturesConnector.java:168)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:401)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:318)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
    at org.dkpro.lab.uima.engine.simple.SimpleExecutionEngine.run(SimpleExecutionEngine.java:138)
    ... 37 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 64945
    at org.apache.uima.cas.impl.CASImpl.getFeatureValue(CASImpl.java:2444)
    at org.apache.uima.cas.impl.CASImpl.getSofaFeat(CASImpl.java:887)
    at org.apache.uima.cas.impl.CASImpl.ll_getSofaCasView(CASImpl.java:5206)
    at org.apache.uima.jcas.cas.AnnotationBase.getView(AnnotationBase.java:85)
    at org.apache.uima.jcas.tcas.Annotation.getCoveredText(Annotation.java:123)
    at org.dkpro.tc.features.tcu.TargetSurfaceFormContextFeature.getTargetText(TargetSurfaceFormContextFeature.java:109)
    at org.dkpro.tc.features.tcu.TargetSurfaceFormContextFeature.extract(TargetSurfaceFormContextFeature.java:69)
    at org.dkpro.tc.core.task.uima.InstanceExtractor.getSparse(InstanceExtractor.java:374)
    at org.dkpro.tc.core.task.uima.InstanceExtractor.getSequenceInstances(InstanceExtractor.java:130)
    at org.dkpro.tc.core.task.uima.InstanceExtractor.getInstances(InstanceExtractor.java:69)
    ... 43 more

Does anyone have an idea what is happening there?

reckart commented 5 years ago

Jenkins, can you test this please?

reckart commented 5 years ago

Jenkins, can you test this please?

reckart commented 5 years ago

> Does anyone have an idea what is happening there?

To me it looks like the CAS may be reinitialized and some JCas objects are being carried over the reinitialization.

reckart commented 5 years ago

Actually, I believe it is a different problem related to the switch from BASE64 to plain string, namely that the document text gets garbled because of some encoding problem.

reckart commented 5 years ago

... but that wouldn't break here:

  public int getFeatureValue(int addr, int feat) {
    return this.getHeap().heap[(addr + this.svd.casMetadata.featureOffset[feat])];
  }

A bad encoding in the sofa string shouldn't corrupt the heap...
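
To illustrate why the string content cannot matter at this point, here is a toy model of the same addressing scheme. It is not UIMA code: `ToyHeap` and `failsWithStaleAddress` are hypothetical names, and the assumption is simply that an address computed against one heap is later used against a smaller one.

```java
// Toy model only: an int[] "heap" indexed the same way as the quoted getFeatureValue.
class ToyHeap {
    private final int[] heap;
    private final int[] featureOffset;

    ToyHeap(int size, int[] featureOffset) {
        this.heap = new int[size];
        this.featureOffset = featureOffset;
    }

    // Plain array indexing: the only thing that can go wrong is the index itself.
    int getFeatureValue(int addr, int feat) {
        return heap[addr + featureOffset[feat]];
    }

    // Returns true if reading feature 0 at addr runs outside the heap array.
    static boolean failsWithStaleAddress(ToyHeap heap, int addr) {
        try {
            heap.getFeatureValue(addr, 0);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

In this model, an address that was valid for a large earlier heap fails on a fresh, smaller one, no matter what text the document holds.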

jcklie commented 5 years ago

We do not deal with bytes anymore, except for what UIMA does when writing/reading CASes and type systems to/from disk. The sofaString also looked ok. It has some newlines in it, but I hope that is not a problem.

reckart commented 5 years ago

Hm... actually it seems that the CAS on which the code is failing doesn't even come from the outside. The code fails on a dummy CAS which is just created in order to extract feature names!?

Thread [main] (Suspended (breakpoint at line 123 in Annotation))    
    TextClassificationTarget(Annotation).getCoveredText() line: 123 
    TargetSurfaceFormContextFeature.getTargetText(Integer) line: 109    
    TargetSurfaceFormContextFeature.extract(JCas, TextClassificationTarget) line: 69    
    InstanceExtractor.getDense(JCas, TextClassificationTarget, FeatureExtractorResource_ImplBase) line: 367 
    InstanceExtractor.getSequenceInstances(JCas, boolean) line: 133 
    InstanceExtractor.getInstances(JCas, boolean) line: 69  
    ExtractFeaturesConnector.getFeatureNames(JCas) line: 243    << TWO LINES BEFORE THIS, A MOCK CAS IS CREATED
    ExtractFeaturesConnector.process(JCas) line: 162

reckart commented 5 years ago

> Hm... actually it seems that the CAS on which the code is failing doesn't even come from the outside. The code fails on a dummy CAS which is just created in order to extract feature names!?

Red herring - set line breakpoint instead of waiting for exception....

reckart commented 5 years ago

Jenkins, can you test this please?

reckart commented 5 years ago

Ok - to sum it up - DKPro TC really doesn't like it if the documentID of all CASes is the same. That's what is causing this esoteric error message - JCas classes are carried over across CAS resets if the documentIDs are the same.
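
One way to avoid the collision, assuming the caller controls the document ID, is to hand out a distinct ID per CAS. The sketch below is only an illustration: `DocumentIds` and `next` are hypothetical names, and the thread does not say how the IDs were actually made unique.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: produces a different document ID on every call.
class DocumentIds {
    private static final AtomicLong COUNTER = new AtomicLong();

    // The prefix is arbitrary; the counter is what guarantees uniqueness within one JVM.
    static String next(String prefix) {
        return prefix + "-" + COUNTER.incrementAndGet();
    }
}
```

The returned string would then be set as the documentId in each CAS's document metadata before the CAS is handed to DKPro TC.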

Horsmann commented 5 years ago

Thanks for seeing the issue with the mock CAS.

I think the BinCasWriter also uses the document metadata to determine the file name of the CAS that is written to disk. Except for that, there shouldn't be other dependencies on the DMD info? Did you find more places where this is still used?

reckart commented 5 years ago

> I think the BinCasWriter also uses the document metadata to determine the file name of the CAS that is written to disk.

The writers use the documentId only if no documentUri and documentBaseUri are available or if they are forced to use the documentId.

> Except for that, there shouldn't be other dependencies on the DMD info? Did you find more places where this is still used?

org.dkpro.tc.features.tcu.TcuLookUpTable.isTheSameDocument(JCas aJCas) uses DMD to determine if two CASes are the same. We didn't search for other usages.
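
A toy model of how such an ID-based check can go wrong when every CAS carries the same ID. This is not the DKPro TC implementation: `ToyLookupTable`, `sameDocument` and `lookup` are hypothetical, and the assumption is that the table is rebuilt only when the ID changes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: a lookup table that is rebuilt only when the document ID changes.
class ToyLookupTable {
    private String lastDocumentId;
    private Map<Integer, String> unitsByIndex = new HashMap<>();

    // Stand-in for a "same document?" check that looks at the ID and nothing else.
    private boolean sameDocument(String documentId) {
        return documentId.equals(lastDocumentId);
    }

    // Returns the cached unit at the given index, rebuilding the cache for a new ID.
    String lookup(String documentId, String[] units, int index) {
        if (!sameDocument(documentId)) {
            unitsByIndex = new HashMap<>();
            for (int i = 0; i < units.length; i++) {
                unitsByIndex.put(i, units[i]);
            }
            lastDocumentId = documentId;
        }
        return unitsByIndex.get(index);
    }
}
```

With identical IDs, the second call returns the unit cached from the first document. With distinct IDs, the table is rebuilt and the second call sees its own data.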

jcklie commented 5 years ago

@reckart Do you have an idea why the tests failed on Jenkins?

reckart commented 5 years ago

No idea. Since the test uses a JUnit TemporaryFolder rule instead of writing the test outputs to target/test-output (cf. de.tudarmstadt.ukp.dkpro.core.testing.DkproTestContext.getTestOutputFolder()), it is also hard to tell whether the models and/or any other outputs are generated or not.
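
A minimal sketch of the alternative, using only the JDK so that it does not depend on DkproTestContext: `TestOutput` and `folderFor` are hypothetical names, and the path is resolved relative to the working directory, which is assumed to be the module root during a Maven build.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: keeps test outputs on disk after the test has finished.
class TestOutput {
    // Creates (if needed) and returns target/test-output/<testName>.
    static Path folderFor(String testName) {
        Path dir = Paths.get("target", "test-output", testName);
        try {
            return Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unlike a temporary folder, this directory is not deleted when the test ends, so the generated models can be inspected in the workspace of a failed build.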

Horsmann commented 5 years ago

Locally things do work? Then I would just merge the changes and get started?!

jcklie commented 5 years ago

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] inception-recommender .............................. SUCCESS [  0.002 s]
[INFO] recommender-core ................................... SUCCESS [ 14.605 s]
[INFO] recommender-model-repository ....................... SUCCESS [  0.241 s]
[INFO] recommender-server ................................. SUCCESS [ 17.257 s]

reckart commented 5 years ago

Better not merge things that do not pass the Jenkins tests. Maybe something doesn't like it that there are spaces in the paths used on Jenkins!?
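
If spaces in paths were the cause (the thread does not confirm this), a typical way for it to happen in Java is converting a file URL back to a path via `URL.getPath()`, which keeps the space percent-encoded. The sketch below contrasts that with a conversion via `URI`; `SpacePaths`, `naive` and `robust` are hypothetical names, and the example path assumes a Unix-like system.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper contrasting two ways of turning a file URL back into a path.
class SpacePaths {
    private static URL toUrl(File file) {
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    // Keeps the percent-encoding, so a space in the path comes back as %20.
    static String naive(File file) {
        return toUrl(file).getPath();
    }

    // Decodes the URL via URI, so the space survives as a space.
    static Path robust(File file) {
        try {
            return Paths.get(toUrl(file).toURI());
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For a file such as `/tmp/my workspace/model.bin`, the naive variant yields a string that names a directory that does not exist, while the robust variant yields the original path.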

Horsmann commented 5 years ago

Hm, so how do we proceed with this one?

jcklie commented 5 years ago

Jenkins, can you test this please?

reckart commented 5 years ago

Jenkins, can you test this please?

@Rentier @Horsmann you are now also whitelisted on Jenkins

Horsmann commented 5 years ago

I manually created a job on our Jenkins, and the pull request builds there. I am not sure what the issue is, but it seems to be related to UKP's Jenkins.

ukp-svc-jenkins commented 5 years ago

0% (0.0%) vs master 0%