shon-otmazgin / fastcoref

MIT License

fix: filter out non-valid spans #44

Open Pablito2020 opened 1 year ago

Pablito2020 commented 1 year ago

Commit 59f99b8493a90a7ce60e9163b212b5a7b97cb225 fixed the issue, but only for calls that use the string representation. If this function is called without the string representation (for example, through the pipe() method of the fastcoref spaCy implementation), it still raises a TypeError.

You can reproduce this by running pipe() with the resolve_text option enabled:

import spacy
from fastcoref import spacy_component  # registers the "fastcoref" pipeline factory

texts = [
    "Love shines through this great illustrated kids’ book . Read how a little girl makes chores fun and easy to do. A  fantastic addition to your little one’s free bed time story collection.",
    "Love shines through this great illustrated kids’ book . Read how a little girl makes chores fun and easy to do. A  fantastic addition to your little one’s free bed time story collection."
]

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("fastcoref", config={'model_architecture': 'FCoref', 'device': 'cpu', 'enable_progress_bar': False})
docs = nlp.pipe(texts, component_cfg={"fastcoref": {"resolve_text": True}})
for doc in docs:  # Oops! The TypeError is raised here, while the generator is consumed
    pass

This commit adds a check for None spans.
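For illustration only, here is a minimal sketch of the kind of filtering described above. It is not the actual diff: the helper name `drop_invalid_spans` and the cluster layout (lists of `(start, end)` tuples in which an unmappable span shows up as `None`) are assumptions made for this example.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]


def drop_invalid_spans(clusters: List[List[Optional[Span]]]) -> List[List[Span]]:
    """Remove None spans from each cluster and drop clusters left empty.

    Hypothetical helper: the real fix may live elsewhere and look different.
    """
    cleaned = []
    for cluster in clusters:
        # Keep only spans that were actually resolved to offsets.
        valid = [span for span in cluster if span is not None]
        if valid:
            cleaned.append(valid)
    return cleaned


# Made-up clusters; None stands in for a span that could not be mapped.
clusters = [[(0, 4), None, (10, 14)], [None], [(20, 23), (30, 33)]]
cleaned = drop_invalid_spans(clusters)
print(cleaned)
```

Filtering before the spans are unpacked means downstream code never has to handle a `None` entry, whether or not the string representation was requested.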