dmis-lab / BioSyn

ACL'2020: Biomedical Entity Representations with Synonym Marginalization
https://arxiv.org/abs/2005.00239
MIT License

Input/Output clarification #9

Closed amirj closed 3 years ago

amirj commented 3 years ago

If I understand correctly, we have a dictionary that maps "aliases"/"synonyms" to a list of corresponding CUIs. The input to the algorithm is a string (the mention), and the output is a list of entries from the dictionary:

{
  "mention": "ataxia telangiectasia", 
  "predictions": [
    {"name": "ataxia telangiectasia", "id": "D001260|208900"}, 
    {"name": "ataxia telangiectasia syndrome", "id": "D001260|208900"}, 
    {"name": "ataxia telangiectasia variant", "id": "C566865"}, 
    {"name": "syndrome ataxia telangiectasia", "id": "D001260|208900"}, 
    {"name": "telangiectasia", "id": "D013684"}
  ]
}

Since our goal is to map the mention to a CUI, I'm wondering whether there is any functionality that maps the above output to a single CUI.

mjeensung commented 3 years ago

Hi amirj

The predictions are sorted by their final scores, so the first item in the predictions is the top-1 prediction, and its CUI is the final single CUI for the given input mention.
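As a minimal sketch of the point above (not code from the BioSyn repository), the top-1 id can be read off the first entry of `predictions`. The helper name `top1_id` is hypothetical, and the example data is an abbreviated copy of the output quoted in the question.

```python
# Abbreviated copy of the example output from the question above.
output = {
    "mention": "ataxia telangiectasia",
    "predictions": [
        {"name": "ataxia telangiectasia", "id": "D001260|208900"},
        {"name": "ataxia telangiectasia syndrome", "id": "D001260|208900"},
        {"name": "ataxia telangiectasia variant", "id": "C566865"},
    ],
}


def top1_id(result):
    """Return the id of the first (highest-scored) prediction, or None if empty.

    Hypothetical helper: it relies only on the predictions being sorted
    by final score, as described in the comment above.
    """
    predictions = result.get("predictions", [])
    return predictions[0]["id"] if predictions else None


print(top1_id(output))  # prints D001260|208900
```

Note that the returned id may still be a composite such as `D001260|208900`.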

amirj commented 3 years ago

Thanks, Mujeen. What happens if the top prediction corresponds to multiple CUIs? What is the best practice for extracting the target CUI among them?

mjeensung commented 3 years ago

When a single prediction has multiple CUIs, the CUIs are from multiple KBs. For example, in 'D001260|208900', 'D001260' is from MeSH and '208900' is from OMIM.

For evaluation, I count a prediction as correct when any of the predicted CUIs matches the gold answer. In practice, however, you can choose whichever CUI belongs to the KB you are using.
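The two rules above could be sketched roughly as follows. This is an illustration, not the repository's evaluation code: the function names are hypothetical, the gold answer is assumed to possibly be a `|`-joined composite as well, and `guess_kb` uses a crude heuristic based only on the example in this thread (all-digit ids treated as OMIM, everything else as MeSH).

```python
def split_ids(composite_id):
    """Split a composite id such as 'D001260|208900' into its parts."""
    return composite_id.split("|")


def is_correct(predicted_id, gold_id):
    """Count a prediction as correct if any predicted CUI matches any gold CUI."""
    return bool(set(split_ids(predicted_id)) & set(split_ids(gold_id)))


def guess_kb(cui):
    """Assumption for illustration only: all-digit ids are OMIM, others MeSH."""
    return "OMIM" if cui.isdigit() else "MeSH"


def pick_for_kb(composite_id, kb):
    """Return the first CUI that appears to belong to the requested KB, else None."""
    for cui in split_ids(composite_id):
        if guess_kb(cui) == kb:
            return cui
    return None


print(is_correct("D001260|208900", "208900"))
print(pick_for_kb("D001260|208900", "MeSH"))
```

If the dictionary mixes other KBs, the `guess_kb` heuristic would need to be replaced with an explicit lookup of where each id came from.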

amirj commented 3 years ago

Thanks for your clarification. In my use case, I'm leveraging UMLS "aliases/synonyms" in my dictionary. As a result, an ambiguous synonym would be mapped to more than one CUI. What's your suggestion in this situation?

mjeensung commented 3 years ago

I think it depends on how you use it. Is there a reason that you want to extract just one CUI?

amirj commented 3 years ago

Yes, I want to map each mention directly to exactly one entity, i.e., entity linking.

mjeensung commented 3 years ago

Hi amirj

Thank you for your patience.

That's a good point. In our work, however, we focus on handling term variations of biomedical concepts rather than on disambiguating mentions.