tmills / ctakes-docker

Apache License 2.0

Question about adding other dictionaries #16

Closed MatthewVita closed 6 years ago

MatthewVita commented 6 years ago

Hi Tim,

Hopefully this is an easy one for you.

Per the discussion I had with Sean on the cTAKES mailing list, I followed up by creating a YouTube video on creating ICD10 dictionaries for cTAKES: https://www.youtube.com/watch?v=4aOnafv-NQs

What is not present in the video is how to wire up the cTAKES configuration to actually use the dictionary (which I assumed would be the easy part 😄).

The following code changes don't apply the dictionary and I just can't comprehend the LookupXml piper file documentation on the wiki.

RUN mkdir apache-ctakes-4.0.0/resources/org/apache/ctakes/dictionary/lookup/fast/icd10
COPY icd10 apache-ctakes-4.0.0/resources/org/apache/ctakes/dictionary/lookup/fast/icd10
COPY icd10.xml apache-ctakes-4.0.0/resources/org/apache/ctakes/dictionary/lookup/fast/

Thoughts?

As always, I will be sure to document the solution either in a public YouTube video or in a wiki somewhere.

tmills commented 6 years ago

I think for our purposes the Piper documentation isn't relevant. We're using old-school UIMA XML descriptor files. Sean wrote his own pipeline descriptor file format called Piper, which is much easier to read, but I wanted to stick with vanilla UIMA. So those commands look correct for putting the dictionary and the lookup descriptor in the right place. Then you need to edit the dictionary lookup annotator's descriptor so that it points to your lookup descriptor: /apache-ctakes-4.0.0/desc/ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml

Find the section called "DictionaryDescriptor" and replace the string value there with yours: <string>org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml</string> => <string>org/apache/ctakes/dictionary/lookup/fast/icd10.xml</string>
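
If you would rather script that edit than make it by hand, something like the following might work. It is only a sketch: the function name swap_dictionary_descriptor is made up for illustration, and the descriptor path is passed in as an argument because the right path relative to the image's working directory is an assumption on my part.

```shell
# Hypothetical helper (not part of cTAKES or ctakes-docker): rewrites the
# DictionaryDescriptor string value in the annotator descriptor given as $1.
swap_dictionary_descriptor() {
  descriptor="$1"
  # Swap the stock dictionary path for the custom one. Writing to a
  # temporary file and moving it back avoids relying on `sed -i`.
  sed 's#org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab\.xml#org/apache/ctakes/dictionary/lookup/fast/icd10.xml#' \
    "$descriptor" > "$descriptor.tmp" && mv "$descriptor.tmp" "$descriptor"
}

# Example call (path taken from the comment above; adjust to your image layout):
# swap_dictionary_descriptor apache-ctakes-4.0.0/desc/ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml
```

The same sed command could also go directly in a RUN line after the COPY steps. Either way, it only changes anything if the descriptor still contains the stock sno_rx_16ab.xml path.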

MatthewVita commented 6 years ago

Awesome. I will try this and document it.

MatthewVita commented 6 years ago

@tmills see my comment over at https://github.com/tmills/ctakes-docker/pull/17