Open hg2051 opened 4 years ago
There is some overhead in aligning the annotation and creating the spaCy `Doc`, although I wouldn't have expected it to be that significant vs. the stanza processing time.
The main difference may be batching, though. `stanza` doesn't have any native support for batching (they just suggest concatenating docs with `\n\n`), so the `spacy-stanza` wrapper processes each text individually, even with `nlp.pipe`, since we also want to be able to process docs that contain `\n\n` without problems.
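
To illustrate why the concatenation trick is not safe in general, here is a minimal pure-Python sketch. It does not call stanza or spacy-stanza at all; `join_for_batch` and `split_batch` are hypothetical helper names used only to show the boundary ambiguity when a text already contains `\n\n`.

```python
def join_for_batch(texts):
    """Concatenate texts with a blank line between them (the suggested batching trick)."""
    return "\n\n".join(texts)


def split_batch(joined):
    """Naively recover the individual texts by splitting on the blank line."""
    return joined.split("\n\n")


# Texts without internal blank lines survive the round trip.
clean = ["First doc.", "Second doc."]
clean_ok = split_batch(join_for_batch(clean)) == clean

# A text that itself contains "\n\n" is split into two pieces,
# so the original document boundaries can no longer be recovered.
tricky = ["Para one.\n\nPara two.", "Another doc."]
recovered = split_batch(join_for_batch(tricky))
tricky_ok = recovered == tricky

print(clean_ok, tricky_ok, len(recovered))
```

With the second input, two texts go in and three pieces come out, which is the ambiguity that processing each text individually avoids.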
Can you provide more details about how you're using `spacy-stanza`?
Spacy Stanza is much slower than merely Stanza