Instructions for running the cTAKES REST server to extract CUIs from documents.
Install Docker (and make sure it's running correctly)
Create a UMLS account: https://uts.nlm.nih.gov/license.html
Build the Docker image from the Dockerfile in this directory:
docker build -t ctakes-web-rest .
Set an environment variable with your UMLS API key:
export umls_api_key=<your umls api key>
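If you want to confirm the variable is visible before starting the container, a minimal Python sketch follows. The helper name umls_key_is_set is hypothetical and not part of this repository; it only checks that the variable is present and non-empty, not that the key is valid. Run it from the same shell in which you ran the export.

```python
import os


def umls_key_is_set(env=None, name="umls_api_key"):
    """Return True if the variable exists and is not blank.

    Hypothetical helper: `env` defaults to the current process environment
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    return bool(env.get(name, "").strip())


if __name__ == "__main__":
    if umls_key_is_set():
        print("umls_api_key is set")
    else:
        print("umls_api_key is missing or empty; run the export command first")
```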
Start the docker container:
./start_rest.sh
Startup can take a long time. Find the container id with docker ps, then check the startup progress with docker logs <container id>. You only need to specify the first few characters of the container id. The container is ready when the final line of the log contains the following: org.apache.catalina.startup.Catalina.start Server startup in [25,329] milliseconds (the time in brackets will vary).
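Rather than rerunning docker logs by hand, the wait can be scripted. The sketch below is not part of this repository: the function names, the polling interval, and the timeout are all assumptions. It looks for the startup message anywhere in the log rather than only on the final line, and it reads both stdout and stderr because docker logs splits the container's output across the two streams.

```python
import subprocess
import time

# Text from the startup line quoted above, without the variable timing part.
READY_MARKER = "org.apache.catalina.startup.Catalina.start Server startup in"


def is_ready(log_text):
    """Return True if the startup message appears anywhere in the log text."""
    return READY_MARKER in log_text


def wait_until_ready(container_id, poll_seconds=30, timeout_seconds=3600):
    """Poll `docker logs` until the server reports startup or the timeout passes.

    The default interval and timeout are guesses; adjust them for your machine.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["docker", "logs", container_id],
            capture_output=True,
            text=True,
        )
        if is_ready(result.stdout + result.stderr):
            return True
        time.sleep(poll_seconds)
    return False
```

Call wait_until_ready with the container id (or its first few characters) reported by docker ps; it returns False if the message has not appeared before the timeout.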
Run the script sample_extract_cuis.py, which demonstrates how to use the functions in ctakes_rest.py to extract CUIs using the REST server. Then check output.txt to make sure it ran correctly: each line should contain a filename followed by a space-delimited list of CUIs.
python sample_extract_cuis.py fake_notes/ output.txt
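The check of output.txt can also be automated. The following is a rough sketch under two assumptions that this README does not confirm: filenames contain no spaces, and every CUI has the standard UMLS form of the letter C followed by seven digits. The function names are hypothetical. A line with a filename and no CUIs is accepted, since a note may yield no concepts.

```python
import re

# Standard UMLS CUI shape: "C" followed by seven digits (assumed for the output).
CUI_PATTERN = re.compile(r"^C\d{7}$")


def check_line(line):
    """Return True if the line is a filename followed by zero or more CUIs."""
    parts = line.split()
    if not parts:
        return False
    return all(CUI_PATTERN.match(token) for token in parts[1:])


def check_output(path):
    """Return the 1-based numbers of non-blank lines that fail check_line."""
    bad_lines = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if line.strip() and not check_line(line):
                bad_lines.append(number)
    return bad_lines
```

Calling check_output("output.txt") returns an empty list when every non-blank line matches the expected shape.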