explosion / spacy-stanza

💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
MIT License

Tokenizer pipe() method for duck-type compatibility with spaCy Tokenizer #8

Closed: buhrmann closed this 5 years ago

buhrmann commented 5 years ago

This is useful when code explicitly calls nlp.tokenizer.pipe(), as the constructor of spacymoji.Emoji does, for example.
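To illustrate the idea, here is a minimal sketch of what a duck-typed pipe() could look like. It is not the code from this PR: `SketchTokenizer` is a hypothetical stand-in, its `__call__` only splits on whitespace instead of running Stanza, and accepting and ignoring extra keyword arguments such as `batch_size` is an assumption.

```python
class SketchTokenizer:
    """Hypothetical stand-in for the wrapper's tokenizer (not the real class)."""

    def __call__(self, text):
        # Placeholder: the real wrapper would run the Stanza pipeline
        # and build a spaCy Doc. Here we just split on whitespace.
        return text.split()

    def pipe(self, texts, **kwargs):
        """Yield one result per text, mirroring the call shape of a pipe() method."""
        # Assumption: keyword arguments such as batch_size are accepted
        # for compatibility and ignored, since nothing is batched here.
        for text in texts:
            yield self(text)


tokenizer = SketchTokenizer()
docs = list(tokenizer.pipe(["Hello world", "spaCy meets Stanza"], batch_size=32))
```

With a method like this in place, callers that iterate over tokenizer.pipe(texts) should keep working, even though each text is still processed one at a time.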

ines commented 5 years ago

Thanks, this makes sense! 👍

Btw, I'm in two minds about whether we should also implement nlp.pipe. It does make sense for API consistency, but it'd also send the wrong message because it doesn't actually batch up the examples like spaCy's built-in nlp.pipe. Maybe we could just shim that method for now and raise a NotImplementedError explaining this.
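Roughly, the shim could look like the sketch below. `SketchLanguage` is a hypothetical stand-in for the wrapper's Language subclass, and the error message wording is only an assumption about what the explanation might say.

```python
class SketchLanguage:
    """Hypothetical stand-in for the wrapper's Language subclass."""

    def __call__(self, text):
        # Placeholder for running the Stanza pipeline on a single text.
        return text.split()

    def pipe(self, texts, **kwargs):
        # Deliberately a plain method, not a generator (no yield), so the
        # error is raised as soon as pipe() is called, not on first iteration.
        raise NotImplementedError(
            "pipe() is not implemented in this wrapper because it would not "
            "batch texts the way spaCy's built-in nlp.pipe does. "
            "Call nlp(text) on each text instead."
        )


nlp = SketchLanguage()
try:
    nlp.pipe(["Hello world"])
    raised = False
except NotImplementedError as err:
    raised = True
    message = str(err)
```

Raising eagerly, without turning pipe() into a generator, means users see the explanation at the call site and not later when they start iterating.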

In the future, it could be cool to handle the batching in this wrapper; see the discussion in stanfordnlp/stanfordnlp#27.