Closed by DeNeutoy 6 years ago
Hey,
The document is being parsed. The problem is that the parser is basically useless if you don't run the tagger, because the input you're passing through at run-time is so different from the training data.
This is fixed in v2, because the parser no longer uses the POS tags as features.
We're pushing another release tonight or tomorrow, but you could already do pip install spacy-nightly. The docs are at https://alpha.spacy.io
We'll be pushing a release candidate for v2 as soon as we get the models retrained, and we finish the rest of the tests. All the target features are now implemented, and there are currently 0 open bugs on the repository :tada:
Awesome, looking forward to v2. Spacy is 🔥.
Sentence segmentation returns a single unsegmented Span object consisting of the whole document if processed using a Language object which did not load a POS tagger.
Environment
Trying to do sentence segmentation without the parser throws an interpretable error.
These two cases silently return a single spacy.tokens.span.Span object consisting of the entire document.
Seems like this could be fixed by simply requiring that Doc.is_tagged == True here. Happy to submit a PR for this, but it seems like a fix which may break stuff, so I thought I'd check here first.
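To make the proposal concrete, here is a rough sketch of the kind of guard I have in mind, written as a user-side helper rather than a patch to spaCy itself. The name checked_sents is hypothetical and not part of spaCy; it only assumes the document object exposes is_tagged and sents, as Doc does. The stand-in objects are there purely so the sketch runs without spaCy or a model installed.

```python
from types import SimpleNamespace


def checked_sents(doc):
    """Return the sentences of `doc` as a list, refusing untagged documents.

    Hypothetical helper, not part of spaCy. It relies only on the
    `is_tagged` flag and the `sents` attribute of the object passed in.
    """
    if not getattr(doc, "is_tagged", False):
        # Fail loudly instead of silently returning one Span for the whole text.
        raise ValueError(
            "document was not POS-tagged; sentence boundaries would be unreliable"
        )
    return list(doc.sents)


# Stand-in objects (assumptions for illustration, not real Doc instances).
untagged = SimpleNamespace(is_tagged=False, sents=["the whole document as one span"])
tagged = SimpleNamespace(is_tagged=True, sents=["First sentence.", "Second sentence."])

print(checked_sents(tagged))
try:
    checked_sents(untagged)
except ValueError as err:
    print("refused:", err)
```

Raising here mirrors the error you already get when the parser is missing. Whether an exception or a warning is the better behaviour inside the library is exactly the "may break stuff" question.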