IHTSDO / snowstorm

Scalable SNOMED CT Terminology Server using Elasticsearch

German terms / Regular Expression / Fuzzy search #160

Open MahboobehJannesari opened 4 years ago

MahboobehJannesari commented 4 years ago

Hi

I work with German data, but there is no German edition of SNOMED CT.

One solution is to search for a concept using a regular expression or fuzzy search, so that a German term such as "Staphylokokken" matches the English "Staphylococcus".

1- How can I search for a concept using a regular expression or fuzzy search in Snowstorm?

Does anyone have experience working with German terms?

2- Is there another way to search in Snowstorm and get results without translating the German word to English?

kaicode commented 4 years ago

Hi @MahboobehJannesari,

I'm also keen to hear from others about their experience with German terms, but I can say something about what Snowstorm can do now.

So far we have chosen not to implement fuzzy search in Snowstorm. If a user has gone to the effort of typing out a complete, specific term and gets back results that look the same at first glance, the wrong concept may be selected. I realise this may not apply in an NLP scenario, so maybe there is room for improvement.
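For the NLP scenario, fuzzy matching could in principle be done on the client side, outside Snowstorm. The following is only a rough sketch using Python's standard `difflib`. The function name `fuzzy_candidates` and the candidate list are made up for illustration, and the same look-alike risk described above applies, so any hit should be treated as a suggestion to review.

```python
import difflib


def fuzzy_candidates(term, descriptions, n=3, cutoff=0.6):
    """Return up to n descriptions that look similar to term, best first.

    Hypothetical client-side helper; this is NOT a Snowstorm feature.
    Comparison is case-insensitive and purely character based.
    """
    # Map lower-cased text back to the original description.
    by_lower = {d.lower(): d for d in descriptions}
    close = difflib.get_close_matches(
        term.lower(), list(by_lower), n=n, cutoff=cutoff
    )
    return [by_lower[c] for c in close]


if __name__ == "__main__":
    # Illustrative candidates, e.g. terms collected from an earlier search.
    candidates = ["Staphylococcus", "Streptococcus", "Staphylococcus toxin"]
    print(fuzzy_candidates("Staphylokokken", candidates))
```

The `cutoff` value is a judgment call: a lower value returns more candidates but also more look-alike terms that are not the same concept.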

Instead of fuzzy search, Snowstorm has a multiple-prefix, any-order search. A single-word prefix search term such as "staphyl" will match "Staphylococcus", and multiple prefixes such as "staph tox" will match "Staphylococcus toxin". All concept synonyms are used to match results. The results are sorted by the length of the matching description, ascending, so the closest match should be at or near the top of the list, with concepts with more nouns after that. The FSN or preferred term of the top concept may be longer than those lower down in the results, but there will be a short acceptable description on that concept which actually matched.
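To make the matching rule concrete, here is a small Python sketch of the behaviour described above. It is not Snowstorm's implementation (Snowstorm runs its searches in Elasticsearch); the function names and sample descriptions are invented for illustration only.

```python
def matches_all_prefixes(search_term, description):
    """True if every search token is a prefix of some word in the description.

    Token order does not matter and comparison is case-insensitive.
    """
    prefixes = search_term.lower().split()
    words = description.lower().split()
    return all(any(w.startswith(p) for w in words) for p in prefixes)


def search(search_term, descriptions):
    """Return the matching descriptions, shortest first."""
    hits = [d for d in descriptions if matches_all_prefixes(search_term, d)]
    return sorted(hits, key=len)


if __name__ == "__main__":
    # Illustrative descriptions, not taken from a real SNOMED CT release.
    sample = ["Staphylococcus toxin", "Staphylococcus", "Streptococcus"]
    print(search("staphyl", sample))
    print(search("tox staph", sample))
```

Sorting by description length is what puts the plain "Staphylococcus" ahead of "Staphylococcus toxin" for the search term "staphyl".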

There is a regex search mode in the descriptions endpoint. For example:

browser/MAIN/2020-07-31/descriptions?searchMode=REGEX&term=Staphylo.o..us&active=true&conceptActive=true

But I'm not sure how much that will help you.
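As a rough sketch, the request above could be assembled in Python like this. The base URL `http://localhost:8080` is an assumption about where a Snowstorm instance is running, and the helper names are hypothetical. The local check uses Python's `re` module, whose syntax and anchoring may differ from the regex engine used on the server, so it is only a quick sanity test of the pattern.

```python
import re
from urllib.parse import urlencode

# Assumption: a Snowstorm instance reachable at this address.
BASE_URL = "http://localhost:8080"


def build_regex_search_url(pattern, branch="MAIN/2020-07-31", base_url=BASE_URL):
    """Build a descriptions search URL that uses searchMode=REGEX."""
    query = urlencode({
        "searchMode": "REGEX",
        "term": pattern,
        "active": "true",
        "conceptActive": "true",
    })
    return f"{base_url}/browser/{branch}/descriptions?{query}"


def matches_locally(pattern, term):
    """Check a pattern against one term with Python's regex engine.

    Assumption: the whole term must match; the server may behave differently.
    """
    return re.fullmatch(pattern, term) is not None


if __name__ == "__main__":
    print(build_regex_search_url("Staphylo.o..us"))
    for term in ["Staphylococcus", "Staphylokokkus", "Staphylokokken"]:
        print(term, matches_locally("Staphylo.o..us", term))
```

Under these assumptions the pattern accepts both the "c" and the "k" spelling of the singular, but not the German plural "Staphylokokken", which is part of why regex search alone may not help much.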

If you would like to search using German medical terms to match English descriptions, you could try using an API like Google Translate to translate the search term first, but be careful. Translating medical terms is not simple, and translation services often get it wrong. People have tried automatic translation of medical terms when creating a SNOMED CT translation extension, but only ever as a first pass or suggestion before human review.

Kind regards, Kai

kaicode commented 4 years ago

I hope you don't mind but I've updated the title to help other people find this discussion 😄

MahboobehJannesari commented 4 years ago

Thanks for your comments and the update.

kaicode commented 4 years ago

There is an open feature suggestion related to this topic: #153.