Open JavierBJ opened 4 years ago
Thanks for reporting, @JavierBJ!
In our experience, prefix splitting can cause more trouble than it is worth. We're looking at the problematic Freeling rule (^sub) to figure out a solution. In the meantime, you could try using only suffix rules (e.g., clitics) if that fits your scenario. We use something like this in other projects:
import spacy
from spacy_affixes import AffixesMatcher
from spacy_affixes.utils import AFFIXES_SUFFIX
from spacy_affixes.utils import load_affixes
nlp = spacy.load("es")
suffixes = {k: v for k, v in load_affixes().items()
            if k.startswith(AFFIXES_SUFFIX)}
affixes_matcher = AffixesMatcher(nlp, split_on=["VERB"], rules=suffixes)
nlp.add_pipe(affixes_matcher, name="affixes", before="tagger")
Thank you very much @versae for your workaround; it solved the problems mentioned. I'll keep an eye out for any solution you find to the Freeling rule issue.
Kind regards
I'm using spacy-affixes as part of the spaCy pipeline, as explained in the usage guide. It had been working properly until I tried the following sentence: "Sube el paro". When running nlp("Sube el paro.") I get the following error:
From my experience and attempts, I can say the bug happens with texts like:
But not with texts like:
Given the error thrown, something related to matching prefix "sub" might be messing things up.
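To illustrate the suspicion, here is a minimal sketch of how a prefix pattern such as ^sub could also fire on an ordinary verb form like "Sube". It uses only Python's re module and does not reproduce the actual spacy-affixes or Freeling rule logic; PREFIX_RULE, the case-insensitive flag, and matches_prefix_rule are assumptions made up for this example.

```python
import re

# Hypothetical stand-in for the prefix rule. Only the "^sub" pattern comes
# from this thread; case-insensitive matching is an assumption.
PREFIX_RULE = re.compile(r"^sub", re.IGNORECASE)


def matches_prefix_rule(token_text: str) -> bool:
    """Return True when the token text starts with the prefix pattern."""
    return PREFIX_RULE.search(token_text) is not None


# Check each token of the failing sentence against the pattern.
for word in ["Sube", "el", "paro"]:
    print(word, matches_prefix_rule(word))
```

If the real rule behaves anything like this, "Sube" would be treated as a candidate for prefix splitting even though "sub" is not a prefix in that word, which might explain why only some texts trigger the error.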
My configuration