gmichalo / UmlsBERT

How to use pretrained UmlsBERT to get embeddings for all the UMLS terms? #3

Open AneryPatel opened 3 years ago

AneryPatel commented 3 years ago

I want to load the pretrained UmlsBERT model to generate vector representations (embeddings) of all the medical terms in UMLS. Which library should I use to load this model, since it is a modified version of BERT? Also, which layer or combination of layers gives the best vector representations?

gmichalo commented 3 years ago

Thank you for your interest in UmlsBERT.

To build UmlsBERT, we used PyTorch and the Hugging Face library.

However, at present, to use the UmlsBERT architecture you will need the code that we provide in the GitHub repo.

As for the pre-trained UmlsBERT weights, the README provides a link where you can download them.

In our experiments, the best input embeddings came from combining the standard BERT embeddings (Token, Segment, Position) with the Semantic Type Embedding that we introduce in this architecture.
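The combination described above can be pictured as a sum of four embedding lookups. What follows is only a minimal sketch under stated assumptions, not the repo's actual code: the class name, the default sizes, the use of id 0 for tokens with no semantic type, and the final layer normalization are all illustrative choices.

```python
import torch
import torch.nn as nn


class SemanticTypeInputEmbeddings(nn.Module):
    """Illustrative input layer: token + segment + position + semantic type."""

    def __init__(self, vocab_size=30522, hidden_size=768, max_positions=512,
                 num_segments=2, num_semantic_types=45):
        super().__init__()
        self.token = nn.Embedding(vocab_size, hidden_size)
        self.segment = nn.Embedding(num_segments, hidden_size)
        self.position = nn.Embedding(max_positions, hidden_size)
        # Assumption: id 0 means "no semantic type"; padding_idx=0 makes that
        # row a zero vector, so such tokens get no semantic contribution.
        self.semantic_type = nn.Embedding(num_semantic_types, hidden_size,
                                          padding_idx=0)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, input_ids, token_type_ids, semantic_type_ids):
        # Position ids 0..seq_len-1, broadcast over the batch dimension.
        seq_len = input_ids.size(1)
        position_ids = torch.arange(seq_len, device=input_ids.device).unsqueeze(0)
        summed = (self.token(input_ids)
                  + self.segment(token_type_ids)
                  + self.position(position_ids)
                  + self.semantic_type(semantic_type_ids))
        return self.norm(summed)
```

Here `semantic_type_ids` is a hypothetical per-token tensor with the same shape as `input_ids`; how tokens are mapped to UMLS semantic types is left to the code in the repo.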

HodaMemar commented 2 years ago

Hello, I want to work with UmlsBERT on Hugging Face. There are two UmlsBERT models there:

https://huggingface.co/GanjinZero/UMLSBert_ENG and cambridgeltl/SapBERT-UMLS-2020AB-all-lang-from-XLMR

I am a beginner in this area. How can I get the embedding of, for example, a single concept selected from UMLS after executing this line:

model = AutoModel.from_pretrained("GanjinZero/UMLSBert_ENG")

I would be grateful for any help with this issue.

GanjinZero commented 2 years ago

GanjinZero/UMLSBert_ENG belongs to https://github.com/GanjinZero/CODER; it shares a name with this repo only by accident.

cambridgeltl/SapBERT-UMLS-2020AB-all-lang-from-XLMR is also a different model; it belongs to SapBERT.
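For the question about getting a term's embedding from one of these Hugging Face checkpoints, a minimal sketch with the standard `transformers` API might look like the following. It applies to models that load through `AutoModel`, not to the UmlsBERT weights from this repo. The helper names `mean_pool` and `embed_terms` and the `max_length` value are illustrative. Whether the [CLS] vector or a mean over tokens suits a given checkpoint is an assumption to check against that model's own documentation.

```python
import torch


def mean_pool(last_hidden_state, attention_mask):
    """Average the token vectors of each sequence, ignoring padding."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1.0)  # avoid division by zero
    return summed / counts


def embed_terms(terms, model_name="GanjinZero/UMLSBert_ENG", pooling="cls"):
    """Return one vector per term string (downloads the model on first use)."""
    # Imported lazily so mean_pool can be reused without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    batch = tokenizer(terms, padding=True, truncation=True,
                      max_length=32, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden)

    if pooling == "cls":
        return hidden[:, 0]  # vector at the first ([CLS]) position
    return mean_pool(hidden, batch["attention_mask"])
```

Calling `embed_terms(["myocardial infarction", "heart attack"])` would return a tensor with one row per term; the rows can then be compared with, for example, cosine similarity.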