allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.
https://allenai.github.io/scispacy/
Apache License 2.0
1.72k stars 229 forks

Update RxNorm Entity Linker Data #522

Closed (ulc0 closed 1 month ago)

ulc0 commented 3 months ago

Greetings SciSpacy fans!

We have almost successfully run a UMLS update (nmslib issues and all!) in an Azure Databricks Python 3.11 environment.

The code says the process is only for UMLS and MeSH.

How can we refresh RxNorm? Is it some sort of hidden parameter?

Thanks

-Kate CDC Data Hub

dakinggg commented 2 months ago

I believe the export_umls_json.py script is used to produce all of the linker variants, with the --source argument. Is that not working?
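For readers unfamiliar with what filtering by source means here, the following is a minimal sketch of the idea, not the code in export_umls_json.py. It assumes the standard pipe-delimited MRCONSO.RRF column layout from the UMLS Metathesaurus (CUI in column 0, language in column 1, source abbreviation SAB in column 11, name string in column 14). The function name `names_by_concept` and the sample rows are hypothetical and made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set

# Column positions in a pipe-delimited MRCONSO.RRF row (assumed standard layout).
CUI, LAT, SAB, STR = 0, 1, 11, 14


def names_by_concept(
    rows: Iterable[str],
    source: Optional[str] = None,
    lang: str = "ENG",
) -> Dict[str, Set[str]]:
    """Group name strings by CUI, optionally keeping a single source vocabulary."""
    names: Dict[str, Set[str]] = defaultdict(set)
    for row in rows:
        fields = row.rstrip("\n").split("|")
        if len(fields) <= STR:
            continue  # skip empty or malformed rows
        if fields[LAT] != lang:
            continue  # keep only the requested language
        if source is not None and fields[SAB] != source:
            continue  # keep only rows contributed by the requested vocabulary
        names[fields[CUI]].add(fields[STR])
    return dict(names)


# Two invented rows in MRCONSO layout, for illustration only.
sample_rows = [
    "C0000001|ENG|P|L1|PF|S1|Y|A1||||RXNORM|IN|111|exampledrug|0|N||",
    "C0000002|ENG|P|L2|PF|S2|Y|A2||||MSH|MH|D222|example disease|0|N||",
]

rxnorm_only = names_by_concept(sample_rows, source="RXNORM")
all_sources = names_by_concept(sample_rows)
```

With a real Metathesaurus download, the same function could be fed an open file handle on MRCONSO.RRF instead of `sample_rows`.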

ulc0 commented 2 months ago

> I believe the export_umls_json.py script is used to produce all of the linker variants, with the --source argument. Is that not working?

I'll try again; it must be my syntax.

Thanks!

ulc0 commented 2 months ago

umls_util/read_umls_concepts seems to be hardcoded to the UMLS Metathesaurus, not RxNorm. I can probably create a "read_rxnorm_concepts" by reverse engineering the published linker data.

ulc0 commented 1 month ago

> I believe the export_umls_json.py script is used to produce all of the linker variants, with the --source argument. Is that not working?

So the UMLS linker is a superset of the RxNorm linker? In theory, then, I do not have to refresh the RxNorm linker if I have refreshed the UMLS linker?
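One way to check the superset question empirically is to compare the concept IDs in the two knowledge base exports. The sketch below assumes each export holds one JSON object per line with a "concept_id" field. That layout is an assumption and should be verified against the actual files. The helper names `concept_ids` and `missing_from` are hypothetical, and the sample records are invented.

```python
import json
from typing import Iterable, Set


def concept_ids(jsonl_lines: Iterable[str], id_field: str = "concept_id") -> Set[str]:
    """Collect the concept IDs from JSON-lines records (one JSON object per line)."""
    ids: Set[str] = set()
    for line in jsonl_lines:
        line = line.strip()
        if line:  # ignore blank lines
            ids.add(json.loads(line)[id_field])
    return ids


def missing_from(candidate_superset: Set[str], candidate_subset: Set[str]) -> Set[str]:
    """Return the IDs in candidate_subset that candidate_superset lacks."""
    return candidate_subset - candidate_superset


# Invented records standing in for the two exports.
umls_lines = [
    '{"concept_id": "C0000001", "canonical_name": "exampledrug"}',
    '{"concept_id": "C0000002", "canonical_name": "example disease"}',
]
rxnorm_lines = [
    '{"concept_id": "C0000001", "canonical_name": "exampledrug"}',
]

gaps = missing_from(concept_ids(umls_lines), concept_ids(rxnorm_lines))
is_superset = not gaps
```

An empty `gaps` set only shows that every RxNorm concept ID also appears in the UMLS export. It says nothing about whether the aliases match or whether the two linkers would rank candidates the same way.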