tmills / ctakes-docker

Apache License 2.0

Collection Reader Issue + Interesting Story #6

Closed MatthewVita closed 6 years ago

MatthewVita commented 6 years ago

Hi Tim,

Hope you are well.

So I'll start with the good news. I made a YouTube video overviewing how to set up the cTAKES Docker solution for my volunteer team on the OpenEMR project. I expected like ~3 views on the video... it's up to 130 views and a few people have reached out to me saying that it's the easiest way to get up and running. There is even a gentleman playing around with it for his hospital system! Good YouTube SEO I guess!

Now for the not so good news. I'm trying to run the collection reader without the MIST deid stuff and am running into a lot of errors.

Here's the command I'm running:

./bin/runRemoteAsyncAE.sh tcp://10.0.2.15:61616 mainQueue -d desc/localDeploymentDescriptorNoDeid.xml -c desc/FilesInDirectoryCollectionReader.xml -o xmis/

Here's localDeploymentDescriptorNoDeid.xml:

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDeploymentDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <name>coordinatingDescriptor</name>
  <description/>
  <version>1.0</version>
  <vendor/>
  <deployment protocol="jms" provider="activemq">
    <casPool numberOfCASes="1" initialFsHeapSize="2000000"/>
    <service>
      <inputQueue endpoint="mainQueue" prefetch="0"/>
      <topDescriptor>
          <import location="remoteNoDeid.xml"/>
      </topDescriptor>
      <analysisEngine async="true">
        <scaleout numberOfInstances="1"/>

        <delegates>
            <remoteAnalysisEngine key="remoteFastDescriptor">
             <inputQueue endpoint="mainQueue" />
            </remoteAnalysisEngine>
        </delegates>
        <asyncPrimitiveErrorConfiguration>
          <processCasErrors thresholdCount="0" thresholdWindow="0" thresholdAction="terminate"/>
          <collectionProcessCompleteErrors timeout="0" additionalErrorAction="terminate"/>
        </asyncPrimitiveErrorConfiguration>
      </analysisEngine>
    </service>
  </deployment>
</analysisEngineDeploymentDescription>

Here's remoteFastDescriptor.xml

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>false</primitive>
  <delegateAnalysisEngineSpecifiers>
    <delegateAnalysisEngine key="docker-fast-dictionary">
      <import location="/home/matthew/ctakes-docker/desc/docker-fast-dictionary.xml"/>
    </delegateAnalysisEngine>
  </delegateAnalysisEngineSpecifiers>
  <analysisEngineMetaData>
    <name>remoteFastDescriptor</name>
    <description/>
    <version>1.0</version>
    <vendor/>
    <configurationParameters/>
    <configurationParameterSettings/>
    <flowConstraints>
      <fixedFlow>
        <node>docker-fast-dictionary</node>
      </fixedFlow>
    </flowConstraints>
    <fsIndexCollection/>
    <capabilities>
      <capability>
        <inputs/>
        <outputs/>
        <languagesSupported/>
      </capability>
    </capabilities>
    <operationalProperties>
      <modifiesCas>true</modifiesCas>
      <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
      <outputsNewCASes>false</outputsNewCASes>
    </operationalProperties>
  </analysisEngineMetaData>
  <resourceManagerConfiguration/>
</analysisEngineDescription>

I get a bunch of errors. One of them is telling: Caused by: java.lang.ClassNotFoundException: org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader

If I remove the collection reader portion of the command (the -c option), I see this error instead:

Attempting to deploy desc/localDeploymentDescriptorNoDeid.xml ...
Service:Aggregate with de-identification Initialized. Ready To Process Messages From Queue:mainQueue
UIMA AS Service Initialization Complete
Error on process CAS call to remote service:
org.apache.uima.aae.error.UimaEEServiceException: org.apache.uima.aae.error.UimaAsDelegateException: ----> Controller:/Aggregate with de-identification Received Exception  on CAS:-418c09cf:15e0100dd20:-7ffa From Delegate:remoteFastDescriptor

MatthewVita commented 6 years ago

        <delegates>
            <remoteAnalysisEngine key="remoteFastDescriptor">
             <inputQueue endpoint="mainQueue" />
            </remoteAnalysisEngine>
        </delegates>

I originally had myQueueName as the endpoint here. It should be mainQueue, right?

tmills commented 6 years ago

This is probably a deficiency in my documentation. To run that collection reader, the cTAKES jars need to be on your UIMA classpath so that UIMA can find the cTAKES class. Prepend the command with UIMA_CLASSPATH=/path/to/ctakes-bin/lib, and UIMA will look there for jars to load. That explains the first error; maybe retry and see if that helps?
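
For anyone following along, here is a minimal sketch of that suggestion. It assumes the cTAKES jars live under /path/to/ctakes-bin/lib (a placeholder, as in the comment above). The CTAKES_LIB variable and the with_ctakes_classpath helper are hypothetical names used only for illustration; they are not part of ctakes-docker or UIMA.

```shell
#!/bin/sh
# Placeholder path: point this at the lib/ directory of your cTAKES install.
CTAKES_LIB="${CTAKES_LIB:-/path/to/ctakes-bin/lib}"

# Hypothetical helper: run one command with UIMA_CLASSPATH set for that
# process only, so the variable does not leak into the calling shell.
with_ctakes_classpath() {
  env UIMA_CLASSPATH="$CTAKES_LIB" "$@"
}

# The invocation from this thread, left commented out so the sketch is safe
# to paste outside a ctakes-docker checkout:
# with_ctakes_classpath ./bin/runRemoteAsyncAE.sh tcp://10.0.2.15:61616 mainQueue \
#   -d desc/localDeploymentDescriptorNoDeid.xml \
#   -c desc/FilesInDirectoryCollectionReader.xml \
#   -o xmis/

# Sanity check: print the value a child process would actually see.
with_ctakes_classpath sh -c 'echo "UIMA_CLASSPATH=$UIMA_CLASSPATH"'
```

Writing UIMA_CLASSPATH=/path/to/ctakes-bin/lib directly in front of the command, as suggested above, has the same effect; the helper only makes the per-command scoping explicit.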

MatthewVita commented 6 years ago

Okay sir. I appreciate the response. I will try this tomorrow after work.

BTW, I have been working on a data parser and modern web UI for the cTAKES output. I think you'll find it to be very interesting/useful. I will post more information and a demo in the coming days!

MatthewVita commented 6 years ago

Hi Tim,

Hope you are doing well.

I am happy to say that the UIMA_CLASSPATH variable worked like a charm. I have updated the documentation and put in a PR here: https://github.com/tmills/ctakes-docker/pull/7

BTW, I wanted to share with you the parser I am working on. The idea is that it will eventually be at the end of the pipeline as a subscriber somehow: https://github.com/MatthewVita/cTAKES-Concept-Mention-Parser

-m

tmills commented 6 years ago

Great. I've added the info to the relevant section of the readme, so I'll consider this closed. Thanks for the feedback.

MatthewVita commented 6 years ago

Hi Tim. Did you see my PR?

tmills commented 6 years ago

No, sorry, I missed that. I will undo the commit and grab it. Tim

MatthewVita commented 6 years ago

Thank you sir!