higgood / med-jargon-explain-inator

Forking this so that we can associate tasks with the relevant repo. This project is owned by all team members, not by HIGG. HIGG sponsors it only to facilitate project management.

added cleaned umls term and def files #18

Closed ecm68 closed 6 months ago

ecm68 commented 6 months ago

Added:

This version uses UMLS data only and includes concepts from only 31 of the 127 semantic types. The term set has been further cleaned to remove any term containing a comma, semicolon, or at sign (`,` `;` `@`). Terms longer than 5 tokens, split on whitespace, were also removed.
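The cleaning rules described above can be sketched roughly as follows. This is a minimal illustration and not the script actually used for this PR: the names `BANNED_CHARS`, `MAX_TOKENS`, `keep_term`, and `clean_terms` are hypothetical, and it assumes "longer than 5 tokens" means terms with exactly 5 tokens are kept.

```python
# Hypothetical sketch of the term filter; names are illustrative only.

# Punctuation that disqualifies a term (comma, semicolon, at sign).
BANNED_CHARS = set(",;@")

# Assumption: terms with more than 5 whitespace-delimited tokens are dropped,
# so a term with exactly 5 tokens is kept.
MAX_TOKENS = 5


def keep_term(term: str) -> bool:
    """Return True if the term passes both cleaning rules."""
    if any(ch in BANNED_CHARS for ch in term):
        return False
    # str.split() with no argument splits on any run of whitespace.
    return len(term.split()) <= MAX_TOKENS


def clean_terms(terms):
    """Filter an iterable of terms, preserving the original order."""
    return [t for t in terms if keep_term(t)]
```

For example, `clean_terms(["myocardial infarction", "infarction, myocardial"])` would keep only the first term, since the second contains a comma.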

term_to_cui.json still contains terms that don't have corresponding definitions in cui_to_def.json. 211,595 terms are defined; they correspond to 48,143 concepts, each with a single definition set in cui_to_def.json.
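One way to find the terms that lack definitions is sketched below. The JSON layout is an assumption, since the PR does not spell it out: `term_to_cui` is taken to map each term to a CUI string (or a list of CUIs), and `cui_to_def` to map each CUI to its definition set. The function name `split_by_definition` is hypothetical.

```python
# Hypothetical sketch; the structure of both JSON files is assumed, not confirmed.


def split_by_definition(term_to_cui: dict, cui_to_def: dict):
    """Split terms into those with and without a definition in cui_to_def.

    Assumes term_to_cui values are either a single CUI string or a list of
    CUI strings, and that cui_to_def is keyed by CUI.
    """
    defined, undefined = {}, {}
    for term, value in term_to_cui.items():
        # Normalize a single CUI string to a one-element list.
        cuis = [value] if isinstance(value, str) else list(value)
        if any(cui in cui_to_def for cui in cuis):
            defined[term] = value
        else:
            undefined[term] = value
    return defined, undefined


# Usage, assuming both files sit in the working directory:
#   import json
#   with open("term_to_cui.json") as f:
#       term_to_cui = json.load(f)
#   with open("cui_to_def.json") as f:
#       cui_to_def = json.load(f)
#   defined, undefined = split_by_definition(term_to_cui, cui_to_def)
#   print(len(defined), len(undefined))
```

If the assumed layout holds, `len(defined)` should line up with the defined-term count reported above, and `undefined` lists the terms that could be dropped in a later cleanup.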