ncbi-nlp / NegBio

High-performance tool for negation and uncertainty detection in radiology reports

How to write patterns based on other medical data #42

Closed kaushikepi closed 4 years ago

kaushikepi commented 4 years ago

@kaushikacharya How were the pre, post, and neg patterns written, and how can one write one's own patterns based on other data?

kaushikacharya commented 4 years ago

You would need to read the paper (https://arxiv.org/abs/1712.05898) to understand how the negation patterns are identified in the text.

They search for patterns in the dependency graph. One advantage of this over a typical regex pattern search is that it can span long sentences without producing too many false positives. (NegEx is the algorithm that uses regular expressions. MetaMap has used a variant of the NegEx algorithm.)
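To make the false-positive point concrete, here is a deliberately naive regex check. It is only an illustration: it is much simpler than NegEx (which has its own trigger and termination terms), and the function name and example sentence are made up for this sketch.

```python
import re

def naive_regex_negated(sentence, finding):
    """Flag `finding` as negated if the word 'no' appears anywhere before it."""
    pattern = re.compile(r"\bno\b.*\b" + re.escape(finding) + r"\b", re.IGNORECASE)
    return bool(pattern.search(sentence))

sentence = "No fever, but the chest film shows pneumonia."

# 'fever' really is negated here; 'pneumonia' is not, but the regex
# cannot tell which word 'no' attaches to, so it flags both.
fever_flag = naive_regex_negated(sentence, "fever")
pneumonia_flag = naive_regex_negated(sentence, "pneumonia")
print(fever_flag, pneumonia_flag)
```

A dependency-based pattern would instead require an explicit edge between "no" and the finding (or its governor), which is what keeps "pneumonia" from being flagged in a sentence like this.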

The following examples are from the NegBio paper: (image not shown)

Let's take the center example. This pattern is identified by https://github.com/ncbi-nlp/NegBio/blob/master/negbio/patterns/neg_patterns.txt#L26

# no evidence of|for XXX
{} <{dependency:/nmod:of|nmod:for/} ({lemma:/evidence/} >{dependency:/neg/} {})
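Read left to right, the pattern above says: match any node that is an nmod:of or nmod:for dependent of a node whose lemma is "evidence", where that "evidence" node in turn governs some node through a neg edge. A minimal hand-written sketch of that logic on a toy graph follows. It does not use NegBio's own pattern matcher; the edge list, the token ids, and the function name are assumptions made up for this illustration.

```python
import re

# Toy dependency graph for "No evidence of pneumothorax."
# Each edge is (governor id, relation, dependent id); lemmas are keyed by token id.
lemmas = {1: "no", 2: "evidence", 3: "of", 4: "pneumothorax"}
edges = [
    (2, "neg", 1),       # evidence -neg-> no
    (2, "nmod:of", 4),   # evidence -nmod:of-> pneumothorax
    (4, "case", 3),      # pneumothorax -case-> of
]

def match_no_evidence_of(lemmas, edges):
    """Return ids of tokens matched by the 'no evidence of|for XXX' idea."""
    nmod = re.compile(r"nmod:of|nmod:for")
    matches = []
    for gov, rel, dep in edges:
        # The candidate must hang off its governor via nmod:of or nmod:for.
        if not nmod.fullmatch(rel):
            continue
        # The governor must be the lemma 'evidence'.
        if lemmas[gov] != "evidence":
            continue
        # The governor must itself have a 'neg' dependent.
        if any(g == gov and r == "neg" for g, r, _ in edges):
            matches.append(dep)
    return matches

negated = [lemmas[i] for i in match_no_evidence_of(lemmas, edges)]
print(negated)  # ['pneumothorax']
```

If the neg edge is removed (as in "Evidence of pneumothorax"), the function returns an empty list, which mirrors why the pattern needs both conditions.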

In terms of source code, have a look at the match_neg() function in https://github.com/ncbi-nlp/NegBio/blob/master/negbio/neg/neg_detector.py#L59

for pattern in self.neg_patterns:
    for m in pattern.finditer(graph):
        n0 = m.group(0)

This part of the code checks the dependency graph for the patterns listed in neg_patterns.txt.

If you want to add your custom patterns, neg_patterns.txt is the place where it should be done.
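As a rough sketch of what adding a pattern involves, the snippet below parses text laid out like the excerpt quoted earlier: a `#` comment line describing the rule, followed by the pattern itself. The second rule ("absence of XXX") is a hypothetical custom pattern invented for this example, and `read_patterns` is not NegBio's loader; how NegBio actually reads the file should be checked in its source.

```python
SAMPLE = """\
# no evidence of|for XXX
{} <{dependency:/nmod:of|nmod:for/} ({lemma:/evidence/} >{dependency:/neg/} {})

# hypothetical custom rule: absence of XXX
{} <{dependency:/nmod:of/} {lemma:/absence/}
"""

def read_patterns(text):
    """Return (description, pattern) pairs, skipping blank lines."""
    patterns = []
    comment = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Remember the comment so it can label the next pattern line.
            comment = line.lstrip("#").strip()
            continue
        patterns.append((comment, line))
        comment = None
    return patterns

patterns = read_patterns(SAMPLE)
for description, pattern in patterns:
    print(description, "->", pattern)
```

Any custom rule should be tried against a few parsed sentences from the new data before relying on it, since dependency labels can differ between parser versions.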

kaushikepi commented 4 years ago

@kaushikacharya Thank you for the explanation. One more question: does NegBio use the Stanford CoreNLP Java toolkit?

yfpeng commented 4 years ago

Yes. NegBio uses Stanford CoreNLP to convert a parse tree to a dependency graph.

kaushikepi commented 4 years ago

@yfpeng So to use NegBio in proprietary software, do we need to get a license?

yfpeng commented 4 years ago

NegBio is in the public domain. For Stanford CoreNLP, please contact them.

kaushikepi commented 4 years ago

@yfpeng @kaushikacharya Thanks for answering all my questions. I really appreciate it.