facebookresearch / Clinical-Trial-Parser

Library for converting clinical trial eligibility criteria to a machine-readable format.
Apache License 2.0
163 stars, 58 forks

MeSH concepts not available: NEL Fails #6

Closed by DustinWatson1023 4 years ago

DustinWatson1023 commented 4 years ago

ie_parse.sh is failing on the NEL step because it cannot open data/mesh/desc2020.xml. I checked, and that file definitely does not exist. Is it generated outside of this script? The error output is below.


Extract inclusion and exclusion criteria...
I0522 17:31:28.854284    8421 main.go:101] Ingested studies: 20
I0522 17:31:28.867159    8421 main.go:133] 00:00:00.013
Run NER on extracted criteria...
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
[nltk_data] Downloading package punkt to /home/ubuntu/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
Run NEL to map NER terms to MESH concepts...
I0522 17:31:31.423380    8469 conf.go:72] src/resources/config/nel.conf: Lines: 21, Parameters: 10
I0522 17:31:31.423450    8469 main.go:150] Loading MeSH ...
F0522 17:31:31.423474    8469 mesh.go:97] open data/mesh/desc2020.xml: no such file or directory
exit status 255
NEL failed.

salkola commented 4 years ago

You need a MeSH vocabulary, which is not included in the repo. It can be downloaded using mesh.sh.

Note also that word_embeddings.vec.gz, located in data/embedding, needs to be unzipped. An NER model does not need to be trained because a pre-trained NER binary is included.
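
A quick preflight check can catch both problems before ie_parse.sh is run. The following is only a sketch, not part of the repo: the function name check_nel_inputs is hypothetical, the MeSH path is the one from the error log above, and the unzipped file name word_embeddings.vec is an assumption based on gunzip simply stripping the .gz suffix.

```shell
#!/usr/bin/env bash
# Sketch of a preflight check (hypothetical helper, not shipped with the repo).
# Reports which input files are missing under the given repo root.

check_nel_inputs() {
    local root="${1:-.}"   # repo root; defaults to the current directory
    local missing=0
    local f
    # desc2020.xml: path taken from the NEL error message.
    # word_embeddings.vec: assumed name of the unzipped embeddings file.
    for f in "data/mesh/desc2020.xml" "data/embedding/word_embeddings.vec"; do
        if [ ! -f "$root/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"      # 0 if everything is present, 1 otherwise
}

# Example usage from the repo root:
#   gunzip data/embedding/word_embeddings.vec.gz   # produces word_embeddings.vec
#   check_nel_inputs . || echo "Download MeSH with mesh.sh and unzip the embeddings first."
```

If the function prints "missing: data/mesh/desc2020.xml", the MeSH vocabulary still has to be downloaded with mesh.sh, as described above.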