openminted / Open-Call-Discussions

A central place for participants in the open calls to ask questions

SSH UC NER - Hackathon #39

Closed reckart closed 6 years ago

reckart commented 6 years ago

Trying to get the Stanford NER running with the UC SSH NER model....

reckart commented 6 years ago

I can find this component when searching for components on test.openminted.eu: https://test.openminted.eu/landingPage/component/452f72e9-4bf4-4987-960e-3b9e51e4d583

But when I create a new workflow, the component is not available in the sidebar...

reckart commented 6 years ago

@galanisd is it not possible anymore to register DKPro Core SNAPSHOT artifacts via Maven on test.openminted.eu?

greenwoodma commented 6 years ago

test.openminted.eu seems to be completely broken for registering components and building workflows at the moment, as the metadata pages won't load properly.

reckart commented 6 years ago

Well, I could register a released component from DKPro Core 1.9.1 - but I have made some enhancements in 1.9.2-SNAPSHOT and would like to test them.

galanisd commented 6 years ago

> test.openminted.eu seems to be completely broken for registering components and building workflows at the moment, as the metadata pages won't load properly.

I just registered "Compound Annotator" DKPro Core 1.9.1 via "upload XML". Everything OK.

Before that I registered a Docker-based component. No issues there either.

Yes, at some point today the Registry was slow and had issues. However, in the last hour I fixed a bug in the Galaxy wrapper generation and redeployed test.openminted.eu.

reckart commented 6 years ago

@galanisd Didn't you say that the UKP SNAPSHOT Maven repos are also accessible by the registry/workflow execution?

galanisd commented 6 years ago

From OMTD workflow execution yes.

From the Registry, not sure. If you add a component by resolving coordinates in the OMTD Registry and it fails, this probably means that the Registry doesn't see the UKP SNAPSHOT Maven repo.

reckart commented 6 years ago

I seem to be unable to run workflows on test.openminted.eu. The docker logs for the PDFReader step show that some odd identifier is used instead of the class name:

Caused by: java.lang.ClassNotFoundException: 9e6af3e2-7c7e-4809-a9b2-3bebe1245ba1
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[na:1.8.0_161]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_161]
    at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:94) ~[omtd-component-uima-0.0.1-SNAPSHOT-exec.jar:0.0.1-SNAPSHOT]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_161]
    at java.lang.Class.forName0(Native Method) ~[na:1.8.0_161]
    at java.lang.Class.forName(Class.java:264) ~[na:1.8.0_161]
    at eu.openminted.workflows.uima.executor.UIMAFitRunner.uimaFitRun(UIMAFitRunner.java:70) ~[classes!/:0.0.1-SNAPSHOT]
    at eu.openminted.workflows.uima.executor.PipelineCommandLineRunner.run(PipelineCommandLineRunner.java:41) [classes!/:0.0.1-SNAPSHOT]
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:800) ~[spring-boot-1.4.2.RELEASE.jar!/:1.4.2.RELEASE]
    ... 12 common frames omitted
greenwoodma commented 6 years ago

That's the OMTD id for the workflow, no idea why that's being used instead of the class name.
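
The failure in the trace can be reproduced in isolation: `Class.forName` expects a fully qualified class name, so handing it a registry ID can only end in a `ClassNotFoundException`. A minimal sketch follows; `ClassNameCheck` and `isLoadable` are hypothetical names used for illustration, not part of the OMTD executor.

```java
// Hypothetical helper for illustration; not taken from the OMTD code base.
final class ClassNameCheck {

    // Returns true if the given string names a class the current class loader can find.
    static boolean isLoadable(String name) {
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException e) {
            // Same exception type as the one reported in the Docker log.
            return false;
        }
    }

    public static void main(String[] args) {
        // A registry ID is not a class name, so the lookup fails.
        System.out.println(isLoadable("9e6af3e2-7c7e-4809-a9b2-3bebe1245ba1"));
        // A class that is on the class path resolves normally.
        System.out.println(isLoadable("java.lang.String"));
    }
}
```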

reckart commented 6 years ago

Btw. the odd identifier is the registry ID of the PDF reader. Maybe the OMTD UIMA wrapper tries to pick up the class name from the wrong identifier? There are two in the OMTD-SHARE file:

<ms:resourceIdentifiers>
  <ms:resourceIdentifier resourceIdentifierSchemeName="OMTD">9e6af3e2-7c7e-4809-a9b2-3bebe1245ba1</ms:resourceIdentifier>
  <ms:resourceIdentifier resourceIdentifierSchemeName="maven">mvn:de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.io.pdf-asl:1.9.2-SNAPSHOT#de.tudarmstadt.ukp.dkpro.core.io.pdf.PdfReader</ms:resourceIdentifier>
</ms:resourceIdentifiers>
galanisd commented 6 years ago

For UIMA/GATE wrapper generation we get the first resourceIdentifier. When it was developed there was only one resourceIdentifier.

Was the OMTD resourceIdentifier added by you, or was it added automatically by the Registry?
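
Taking the first resourceIdentifier only works while there is exactly one. A rough sketch of selecting by resourceIdentifierSchemeName instead is shown below. It assumes the identifiers have already been read from the OMTD-SHARE file into simple objects; `IdentifierSelector`, `ResourceIdentifier` and `findByScheme` are hypothetical names, not the actual wrapper generation code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the real wrapper generation code may be structured differently.
final class IdentifierSelector {

    // Simplified stand-in for one <ms:resourceIdentifier> entry.
    static final class ResourceIdentifier {
        final String scheme; // value of resourceIdentifierSchemeName
        final String value;  // text content of the element

        ResourceIdentifier(String scheme, String value) {
            this.scheme = scheme;
            this.value = value;
        }
    }

    // Returns the value of the first identifier with the given scheme, or null if none matches.
    static String findByScheme(List<ResourceIdentifier> ids, String scheme) {
        for (ResourceIdentifier id : ids) {
            if (scheme.equals(id.scheme)) {
                return id.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<ResourceIdentifier> ids = Arrays.asList(
            new ResourceIdentifier("OMTD", "9e6af3e2-7c7e-4809-a9b2-3bebe1245ba1"),
            new ResourceIdentifier("maven",
                "mvn:de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.io.pdf-asl:"
                + "1.9.2-SNAPSHOT#de.tudarmstadt.ukp.dkpro.core.io.pdf.PdfReader"));

        // Position-based lookup returns the registry ID once the Registry adds its own entry.
        System.out.println(ids.get(0).value);
        // Scheme-based lookup does not depend on the order of the entries.
        System.out.println(findByScheme(ids, "maven"));
    }
}
```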

reckart commented 6 years ago

It was added by the registry.

Btw, where is the UIMA wrapper code? I don't find the github repo anymore.

reckart commented 6 years ago

@galanisd the class name should be picked up from <ns1:command>de.tudarmstadt.ukp.dkpro.core.io.pdf.PdfReader</ns1:command> rather than being parsed out of the Maven identifier, no?

gkirtzou commented 6 years ago

It must have been added by the registry. Every resource should have a unique OMTD ID within the platform.

On Tue, 22 May 2018, 18:06 Dimitrios Galanis, notifications@github.com wrote:

> For UIMA/GATE wrapper generation we get the first resourceIdentifier. When it was developed there was only one resourceIdentifier.
>
> The OMTD resourceIdentifier was added by you or it was added automatically from the Registry?


galanisd commented 6 years ago

> Btw, where is the UIMA wrapper code? I don't find the github repo anymore.

https://github.com/openminted/omtd-component-executor/tree/master/omtd-component-galaxy

> It was added by the registry.

I will update the code to use the maven resourceIdentifier ASAP. I didn't know that this part has changed.

> the class name should be picked up from <ns1:command>de.tudarmstadt.ukp.dkpro.core.io.aclanthology.AclAnthologyReader</ns1:command> not be parsed out of the Maven identifier, no?

We parse the maven id. There was no command element when the code was written. It will not make any difference. The command element is also not very useful in the case of UIMA/GATE, since the command that we use in the Galaxy wrappers is not known to those who add such components. We (OMTD) generate this command, e.g. runUIMA.sh <className> <coordinates>. The command element is useful in the case of Docker components.
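
As an illustration of parsing the maven id, here is a minimal sketch. It assumes the format seen in this thread, `mvn:<groupId>:<artifactId>:<version>#<className>`; `MavenResourceId` and `parse` are hypothetical names, and whether runUIMA.sh receives the coordinates with or without the `mvn:` prefix is also an assumption.

```java
// Hypothetical parser; the identifier format is inferred from the examples in this thread.
final class MavenResourceId {
    final String coordinates; // groupId:artifactId:version, without the "mvn:" prefix
    final String className;   // fully qualified class name after '#'

    private MavenResourceId(String coordinates, String className) {
        this.coordinates = coordinates;
        this.className = className;
    }

    // Splits "mvn:<coordinates>#<className>" into its two parts.
    static MavenResourceId parse(String id) {
        String prefix = "mvn:";
        int hash = id.lastIndexOf('#');
        if (!id.startsWith(prefix) || hash < 0 || hash == id.length() - 1) {
            throw new IllegalArgumentException("Not a Maven identifier with a class name: " + id);
        }
        return new MavenResourceId(id.substring(prefix.length(), hash), id.substring(hash + 1));
    }

    public static void main(String[] args) {
        MavenResourceId id = parse(
            "mvn:de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.io.bincas-asl:"
            + "1.9.1-SNAPSHOT#de.tudarmstadt.ukp.dkpro.core.io.bincas.BinaryCasReader");
        // Argument order follows the "runUIMA.sh <className> <coordinates>" example above.
        System.out.println("runUIMA.sh " + id.className + " " + id.coordinates);
    }
}
```

A registry ID such as 9e6af3e2-7c7e-4809-a9b2-3bebe1245ba1 is rejected by `parse` with an `IllegalArgumentException`, which would surface the problem at wrapper generation time rather than at workflow execution time.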

reckart commented 6 years ago

@galanisd IMHO the command would be a convenient way to get the class name without having to parse identifiers.

reckart commented 6 years ago

@galanisd anyway, so should the OMTD-SHARE Maven plugin omit the command element?

galanisd commented 6 years ago

> @galanisd IMHO the command would be a convenient way to get the class name without having to parse identifiers.

Yes, it is a little easier to use the command than to parse the resourceID. The only issues I see with that are: a. the UIMA/GATE-component command is a different thing from the Docker-component command in OMTD-SHARE (Java class vs. the name of an executable); b. we put the same info (Java class) in various places (distributionLocation, command, resourceIdentifier).

No problem to keep it. @pennyl67 @greenwoodma what do you think? Whatever you think is more convenient for the people that use the OMTD-SHARE files.

In the case of UIMA/GATE, the code should look for the Java class in the command element; if it is not there, the maven resourceID could be used.
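
The lookup order described here could be sketched roughly as follows. `ClassNameResolver` and `resolve` are hypothetical names, and the sketch assumes that the command element text and the maven resourceID have already been extracted from the OMTD-SHARE file as plain strings.

```java
// Hypothetical sketch of the proposed lookup order; not the actual OMTD implementation.
final class ClassNameResolver {

    // command: text of the command element, may be null or blank.
    // mavenId: maven resource identifier ("mvn:...#<className>"), may be null.
    static String resolve(String command, String mavenId) {
        // Prefer the command element when it is present.
        if (command != null && !command.trim().isEmpty()) {
            return command.trim();
        }
        // Otherwise fall back to the class name after '#' in the maven identifier.
        if (mavenId != null) {
            int hash = mavenId.lastIndexOf('#');
            if (hash >= 0 && hash < mavenId.length() - 1) {
                return mavenId.substring(hash + 1);
            }
        }
        throw new IllegalStateException(
            "No class name found in command element or maven resource identifier");
    }

    public static void main(String[] args) {
        String mavenId = "mvn:de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.io.pdf-asl:"
            + "1.9.2-SNAPSHOT#de.tudarmstadt.ukp.dkpro.core.io.pdf.PdfReader";
        // Command element present: it is used as-is.
        System.out.println(resolve("de.tudarmstadt.ukp.dkpro.core.io.pdf.PdfReader", mavenId));
        // Command element missing: the class name comes from the maven identifier.
        System.out.println(resolve(null, mavenId));
    }
}
```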

galanisd commented 6 years ago

> I will update the code to use the maven resourceIdentifier ASAP. I didn't know that this part has changed.

Fixed. Updated test.openminted.eu. Tested with mvn:de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.io.bincas-asl:1.9.1-SNAPSHOT#de.tudarmstadt.ukp.dkpro.core.io.bincas.BinaryCasReader

2 resourceIDs in OMTD-SHARE (downloaded from My Components):

92325c88-9601-4be6-800b-7409d69f281c
mvn:de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.io.bincas-asl:1.9.1-SNAPSHOT#de.tudarmstadt.ukp.dkpro.core.io.bincas.BinaryCasReader

Generated wrapper uses the correct one.
reckart commented 6 years ago

I tried re-running (since the fix is in the wrapper, I assume that I do not have to re-upload or re-build anything) my application but the failure persists on test.openminted.eu:

Error starting ApplicationContext. To display the auto-configuration report re-run your application with 'debug' enabled.
2018-05-22 19:02:41.927 ERROR 42 --- [           main] o.s.boot.SpringApplication               : Application startup failed

java.lang.IllegalStateException: Failed to execute CommandLineRunner
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:803) ~[spring-boot-1.4.2.RELEASE.jar!/:1.4.2.RELEASE]
    at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:784) ~[spring-boot-1.4.2.RELEASE.jar!/:1.4.2.RELEASE]
    at org.springframework.boot.SpringApplication.afterRefresh(SpringApplication.java:771) ~[spring-boot-1.4.2.RELEASE.jar!/:1.4.2.RELEASE]
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:316) ~[spring-boot-1.4.2.RELEASE.jar!/:1.4.2.RELEASE]
    at eu.openminted.workflows.uima.executor.PipelineCommandLineRunner.main(PipelineCommandLineRunner.java:29) [classes!/:0.0.1-SNAPSHOT]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_161]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_161]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_161]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_161]
    at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48) [omtd-component-uima-0.0.1-SNAPSHOT-exec.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.Launcher.launch(Launcher.java:87) [omtd-component-uima-0.0.1-SNAPSHOT-exec.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.Launcher.launch(Launcher.java:50) [omtd-component-uima-0.0.1-SNAPSHOT-exec.jar:0.0.1-SNAPSHOT]
    at org.springframework.boot.loader.PropertiesLauncher.main(PropertiesLauncher.java:521) [omtd-component-uima-0.0.1-SNAPSHOT-exec.jar:0.0.1-SNAPSHOT]
Caused by: java.lang.ClassNotFoundException: 9e6af3e2-7c7e-4809-a9b2-3bebe1245ba1
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[na:1.8.0_161]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_161]
    at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:94) ~[omtd-component-uima-0.0.1-SNAPSHOT-exec.jar:0.0.1-SNAPSHOT]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_161]
    at java.lang.Class.forName0(Native Method) ~[na:1.8.0_161]
    at java.lang.Class.forName(Class.java:264) ~[na:1.8.0_161]
    at eu.openminted.workflows.uima.executor.UIMAFitRunner.uimaFitRun(UIMAFitRunner.java:70) ~[classes!/:0.0.1-SNAPSHOT]
    at eu.openminted.workflows.uima.executor.PipelineCommandLineRunner.run(PipelineCommandLineRunner.java:41) [classes!/:0.0.1-SNAPSHOT]
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:800) ~[spring-boot-1.4.2.RELEASE.jar!/:1.4.2.RELEASE]
    ... 12 common frames omitted

2018-05-22 19:02:41.932  INFO 42 --- [           main] s.c.a.AnnotationConfigApplicationContext : Closing org.springframework.context.annotation.AnnotationConfigApplicationContext@5f4da5c3: startup date [Tue May 22 19:02:39 UTC 2018]; root of context hierarchy
2018-05-22 19:02:41.935  INFO 42 --- [           main] o.s.j.e.a.AnnotationMBeanExporter        : Unregistering JMX-exposed beans on shutdown

Maybe it is because you are caching the Docker images? Can you flush them?

reckart commented 6 years ago

Btw: node2 ID mesos-f5a9d178-5fca-4974-a360-212d93ab4664

galanisd commented 6 years ago

> Maybe it is because you are caching the Docker images? Can you flush them?

Nope.

I assume that it has to do with the components that were registered when the code was generating "wrong" wrappers. The change I made does not solve the problem for them.

The solution is

reckart commented 6 years ago

@galanisd ok, I have re-uploaded the components and re-created the workflow.

I can see in the docker logs that all the components of the workflow have completed. However, the workflow is still listed as "running" even after several minutes of waiting. Mind that the workflow processed only 2 PDFs.

Any way to figure out where it is stuck now?

reckart commented 6 years ago

@gkirtzou @galanisd I can also see in the Galaxy UI of test.openminted.eu that the workflow completed over 30 minutes ago, but in the registry it still shows as "running" :(

gkirtzou commented 6 years ago

@reckart check your email

gkirtzou commented 6 years ago

@reckart the registry keeps the operation of a workflow in the "running" status, even though the workflow has finished successfully in the workflow engine, due to a bug on the registry side. The registry fails to register the new annotated corpus, which also stalls the update of the operation. @antleb already knows about this, and a fix is expected.

reckart commented 6 years ago

@gkirtzou ok, thanks :) Is this bug being tracked somewhere and can be watched?

gkirtzou commented 6 years ago

@reckart I forgot to create an issue in redmine. I will create one there and I will add you as watcher.

reckart commented 6 years ago

Finally, I have been able to run CoreNLP NER with a custom model :) (although I could only verify it through hidden channels because of the "running" bug).

So, this can be closed now. Next: preparing DKPro Core 1.9.2 release and registering some components on the OMTD production platform.

reckart commented 6 years ago

@gkirtzou @galanisd @greenwoodma Thanks for all the support!