egerber / spaCy-entity-linker

spaCy module for linking text to Wikidata items
MIT License

Add check for empty list of roots #9

Closed jonwiggins closed 1 year ago

jonwiggins commented 2 years ago

Currently, any document containing a sentence that spaCy treats as empty raises an exception, because the code indexes into a list of root tokens that may be empty. For example:

import spacy

nlp = spacy.load("en_core_web_md")
nlp.add_pipe("entityLinker", last=True)

nlp("\n\n")

raises

.../spacy_entity_linker/TermCandidateExtractor.py in _get_candidates_in_sent(self, sent, doc)
     12 
     13     def _get_candidates_in_sent(self, sent, doc):
---> 14         root = list(filter(lambda token: token.dep_ == "ROOT", sent))[0]
     15 
     16         excluded_children = []

IndexError: list index out of range

This PR fixes that by adding a check for an empty list of roots before indexing into it.
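For readers following along, here is a minimal sketch of one way to guard this kind of lookup. It is not necessarily the exact change in this PR: `FakeToken`, `find_root`, and `candidates_for` are hypothetical names used only for illustration, and `FakeToken` stands in for a spaCy token so the snippet runs without a model installed.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class FakeToken:
    """Hypothetical stand-in for a spaCy token; only `dep_` matters here."""
    text: str
    dep_: str


def find_root(sent: Iterable[FakeToken]) -> Optional[FakeToken]:
    """Return the first token whose dependency label is ROOT, or None."""
    # next() with a default avoids the IndexError that `[...][0]` raises
    # when no token in the sentence carries the ROOT label.
    return next((token for token in sent if token.dep_ == "ROOT"), None)


def candidates_for(sent: Iterable[FakeToken]) -> List[str]:
    """Illustrative caller: skip sentences that have no root."""
    root = find_root(sent)
    if root is None:
        return []  # assumption: a rootless sentence yields no candidates
    return [root.text]
```

Called with an empty sentence, `candidates_for([])` returns an empty list instead of raising. Whether a sentence without a root should be skipped or handled some other way is a design choice for the maintainers.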

jonwiggins commented 2 years ago

@egerber Any thoughts?

MartinoMensio commented 1 year ago

Hi @jonwiggins, yes, this is a great PR! Thank you for contributing and fixing this error. I only just noticed this PR because of a recent comment in #10.

Martino