h0gar closed 9 months ago
In spacy-stanza, the whole stanza processing runs as part of the tokenizer step, which is run before any pipeline components.
I think you can provide pretokenized or sentence-per-line text to stanza by passing additional options, as described here: https://github.com/explosion/spacy-stanza/#stanza-pipeline-options
All the options are passed through, so see if anything in their docs looks like what you want: https://stanfordnlp.github.io/stanza/tokenize.html
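A rough sketch of what that could look like, assuming the sentences have already been split by your own segmenter. The helper names (`join_sentences`, `join_pretokenized`, `parse_with_custom_sentences`) are made up for illustration and are not part of either library. `tokenize_no_ssplit` and `tokenize_pretokenized` are the stanza tokenizer options from the docs linked above. The sketch also assumes the stanza model for the language is already downloaded, and it has not been checked against every spacy-stanza version.

```python
from typing import List


def join_sentences(sentences: List[str]) -> str:
    """Join pre-split sentences with blank lines.

    With tokenize_no_ssplit=True, stanza treats two consecutive
    newlines as the sentence boundary and skips its own splitter.
    """
    return "\n\n".join(s.strip() for s in sentences if s.strip())


def join_pretokenized(sentences: List[List[str]]) -> str:
    """Join pre-tokenized sentences: one sentence per line, tokens
    separated by spaces (the input shape for tokenize_pretokenized=True).
    """
    return "\n".join(" ".join(tokens) for tokens in sentences if tokens)


def parse_with_custom_sentences(sentences: List[str], lang: str = "en"):
    """Hypothetical helper: run spacy-stanza on externally segmented text."""
    # Imported lazily so the string helpers above work without the package.
    import spacy_stanza

    # Extra keyword arguments are passed through to stanza.Pipeline.
    nlp = spacy_stanza.load_pipeline(lang, tokenize_no_ssplit=True)
    return nlp(join_sentences(sentences))
```

For example, `parse_with_custom_sentences(["First sentence here.", "Second one."])` should return a `Doc` whose sentence boundaries follow the input list, since stanza is told not to run its own sentence splitter.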
Please feel free to reopen if you're still running into issues!
Hi,
With spaCy, I would normally do this to use a custom sentencizer.
But if I do that with spacy-stanza, I get the following error:
This happens even though "first=True" should make this pipe run before the document is parsed.
Is it possible to use a custom segmentation with spacy-stanza?