GateNLP / python-gatenlp

Python text processing, pattern matching, and NLP framework
https://gatenlp.github.io/python-gatenlp/
Apache License 2.0

Getters fail to fail silently when no name match found #203

Open nicksunderland opened 1 year ago

nicksunderland commented 1 year ago

Describe the bug

Getters don't fail silently if the named group doesn't exist, because _get_match returns None and .get() is then called on None: https://github.com/GateNLP/python-gatenlp/blob/29f5390ca850143f02cfefac3b635835067eae3b/gatenlp/pam/pampac/getters.py#L230

To Reproduce

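from nltk.tokenize import WhitespaceTokenizer
from gatenlp import Document
from gatenlp.processing.tokenizer import NLTKTokenizer
from gatenlp.pam.pampac import (AnnAt, Seq, AddAnn, GetText,
                                Rule, Pampac, PampacAnnotator)

# Add annotation using GetText helper to get text of a POSSIBLE subcomponent of pattern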
#text = """foo bar"""  # scenario 1, works
text = """foo not_bar"""  # scenario 2, fails
doc1 = Document(text)
tok1 = NLTKTokenizer(nltk_tokenizer=WhitespaceTokenizer())
doc1 = tok1(doc1)
print("---------")
for ann in doc1.annset():
    print(doc1[ann].ljust(4, " ") + " - " + str(ann))

pat1 = Seq(AnnAt(text="foo", name="foo_ann"),
           AnnAt(text="bar", name="bar_ann").repeat(0, 1), name="the_whole_pattern")
act1 = AddAnn(name="the_whole_pattern",
              type="FOO_THEN_MAYBE_BAR",
              features={"sub_pattern_feature": GetText(name="bar_ann", silent_fail=True)})
rule = Rule(pat1, act1)
pamp = Pampac(rule, skip="longest", select="first")
annt = PampacAnnotator(pamp, annspec=[("", "Token")], outset_name="")
annt(doc1)

print("----Ok if name 'bar_ann' exists-----")
for ann in doc1.annset(""):
    print(doc1[ann] + " - " + str(ann))

Error:

  File "/Users/nicholassunderland/Library/Caches/pypoetry/virtualenvs/reportextractorpy-aB6e8YJm-py3.11/lib/python3.11/site-packages/gatenlp/pam/pampac/getters.py", line 230, in __call__
    span = match.get("span")
           ^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get'

match being None needs to be handled in the __call__ method of every Getter, for example here:

    def __call__(self, succ, context=None, location=None):
        match = _get_match(
            succ, self.name, self.resultidx, self.matchidx, self.silent_fail
        )
        ann = match.get("ann")
        if ann is None:
            if not self.silent_fail:
                raise Exception(
                    f"No annotation found for name {self.name}, {self.resultidx}, {self.matchidx}"
                )
        return ann
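
One possible shape for the guard, as a minimal self-contained sketch rather than the library's actual fix. `get_match_stub` and `GuardedGetAnn` are hypothetical names invented for illustration, and the stub only assumes that the lookup returns None for a missing name when silent_fail is set, as described above. The real `_get_match` takes more arguments (resultidx, matchidx) and may behave differently.

```python
from typing import Any, Optional


def get_match_stub(matches: dict, name: str, silent_fail: bool = False) -> Optional[dict]:
    """Hypothetical stand-in for _get_match (not the gatenlp implementation).

    Returns the match dict stored under `name`, or None if the name is
    missing and silent_fail is True; raises otherwise.
    """
    match = matches.get(name)
    if match is None and not silent_fail:
        raise Exception(f"No match found for name {name}")
    return match


class GuardedGetAnn:
    """Hypothetical getter that checks for None before calling .get()."""

    def __init__(self, name: str, silent_fail: bool = False):
        self.name = name
        self.silent_fail = silent_fail

    def __call__(self, matches: dict) -> Any:
        match = get_match_stub(matches, self.name, self.silent_fail)
        if match is None:
            # Only reachable with silent_fail=True: nothing matched, so
            # return None instead of calling .get() on None.
            return None
        ann = match.get("ann")
        if ann is None and not self.silent_fail:
            raise Exception(f"No annotation found for name {self.name}")
        return ann


# Made-up match data: "foo_ann" matched, "bar_ann" (the optional part) did not.
matches = {"foo_ann": {"ann": "Token(0,3)", "span": (0, 3)}}
found = GuardedGetAnn("foo_ann", silent_fail=True)(matches)
missing = GuardedGetAnn("bar_ann", silent_fail=True)(matches)
```

The same early return would presumably be needed in each getter that dereferences match, including the one calling match.get("span") in the traceback above.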