dirkweissenborn / ctakes-server

A simple REST-server around ctakes clinical pipeline.

Processing time #4

Closed leanderme closed 7 years ago

leanderme commented 7 years ago

Hi,

I'm running cTAKES 3.2.3 with and without YTEX. The average processing time on my machine is about 5 to 15 minutes per medical report. Running cTAKES from the command line is much faster and takes about 1 to 2 minutes.

I wondered if this is normal, or whether I misconfigured cTAKES to run as a REST server. I had to increase the max RAM to 6 GB, with no significant change in processing time.

AE used (without YTEX): ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml. I couldn't get AggregatePlaintextFastUMLSProcessor.xml working, as it can't find org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator for some reason.

AE used (with ytex):

desc/ctakes-ytex-uima/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml

with

<delegateAnalysisEngine key="DictionaryLookupAnnotatorDB"> 
    <!--import location="./DictionaryLookupAnnotator.xml" /--> 
    <import location="../../../ctakes-clinical-pipelinefast/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml"/> 
</delegateAnalysisEngine>

I couldn't get

<delegateAnalysisEngine key="DictionaryLookupAnnotatorDB"> 
    <!--import location="./DictionaryLookupAnnotator.xml" /--> 
    <import location="../../../ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml"/> 
</delegateAnalysisEngine>

working, for the same reason.
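A "can't find class" error of this kind often means the JAR that contains the class is not on the server's classpath. The following is a minimal Python sketch for checking that, not part of cTAKES or ctakes-server. It relies only on the fact that JAR files are ZIP archives in which a class `a.b.C` is stored as `a/b/C.class`. The function name `find_class_in_jars` and the directory passed to it are hypothetical.

```python
import zipfile
from pathlib import Path


def find_class_in_jars(class_name, jar_dir):
    """Return the paths of all JARs under jar_dir that contain class_name.

    class_name is a fully qualified Java class name; jar_dir is searched
    recursively for *.jar files.
    """
    # Inside a JAR, package separators become slashes and the file ends in .class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(str(jar))
        except zipfile.BadZipFile:
            # Skip files that have a .jar extension but are not valid archives.
            continue
    return hits
```

For example, `find_class_in_jars("org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator", "lib")` lists the JARs in a (hypothetical) `lib` directory that ship the class. An empty list would suggest the fast dictionary lookup module was not built or is not on the classpath the server uses.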

cTAKES: compiled with Java 7 and 8; fresh checkout from SVN.

My machine: 8 GB RAM, 2.7 GHz Intel Core i5 6100.

What hardware is recommended to run cTAKES as a REST service?

jtgreen commented 7 years ago

I think the more germane question is: why won't the fast processor work? That's really the standard now. Dirk, do you have the fast pipeline working with Java 7?

dirkweissenborn commented 7 years ago

I used this implementation a year ago and it was sufficient for my purposes. I am really not a cTAKES expert, so I am not sure why it is that slow. The difference can only be explained by configuration or too little memory. Did you try chunking the input? The size of the document might be the problem. The server itself is really just a simple wrapper for cTAKES that converts the CAS to JSON, no magic. The server code is very simple and lives in a single file, so feel free to fiddle around with it and optimize.

I only used the descriptor from the README, so I don't know what else will work.

If you have optimizations or fixes, please feel free to contribute.
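The chunking idea above could look roughly like the following Python sketch. It is an illustration, not code from ctakes-server: the function name `chunk_text` and the `max_chars` limit are assumptions, and each chunk would be sent to the server as a separate request. Note that any annotation offsets returned would then be relative to the chunk, not to the original report.

```python
def chunk_text(text, max_chars=2000):
    """Split text into chunks of at most max_chars characters.

    Chunks break at blank lines (paragraph boundaries) where possible;
    a single paragraph longer than max_chars is cut into fixed-size pieces.
    """
    chunks = []
    current = ""
    for paragraph in text.split("\n\n"):
        if not paragraph.strip():
            continue  # ignore empty paragraphs
        # Cut oversized paragraphs so every piece fits within the limit.
        pieces = [paragraph[i:i + max_chars]
                  for i in range(0, len(paragraph), max_chars)]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= max_chars:
                current = candidate  # piece still fits in the current chunk
            else:
                chunks.append(current)  # close the current chunk
                current = piece         # and start a new one
    if current:
        chunks.append(current)
    return chunks
```

Cutting a long paragraph at a fixed character count can split a sentence or a term, so a sentence-aware splitter would be a better fit if reports contain very long paragraphs.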

balaabhinav commented 7 years ago

Hi @leandermelms, I am facing the exact same problem with cTAKES version 4.x. Did you find a workaround by any chance?