explosion / spacy-stanza

💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
MIT License
723 stars 59 forks

Can't use the tokenizer only #37

Closed. goxdve closed this issue 4 years ago

goxdve commented 4 years ago

Hi, thank you for your great work. It's very helpful.

I ran into a problem. With spaCy, if I only need to tokenize a sentence and don't need any other lexical features, I can call nlp.tokenizer directly and skip the time spent in the other pipes.

With spacy-stanza, I tried the same approach, but the entire pipeline still seems to run. [image]

Furthermore, when I print nlp.pipeline it is an empty list, so I can't remove any pipes. [image]

This is quite confusing to me. I hope it can be solved, and I look forward to your reply.

goxdve commented 4 years ago

I figured out what to do. I checked the Stanza documentation: when we construct snlp (a stanza.Pipeline object), we can specify which components to load with the processors argument:

import stanza
snlp = stanza.Pipeline(lang="en", processors="tokenize")
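
For anyone who wants to check that only the tokenizer runs, here is a minimal sketch built on the snippet above. It assumes the English models were already fetched with stanza.download("en"). The helper names build_tokenizer_pipeline and token_texts are made up for illustration and are not part of stanza or spacy-stanza.

```python
def build_tokenizer_pipeline(lang="en"):
    """Return a Stanza pipeline that loads only the tokenizer.

    Hypothetical helper. stanza is imported lazily because it is a heavy
    dependency and needs its models downloaded beforehand.
    """
    import stanza

    # processors="tokenize" keeps POS tagging, lemmatization, parsing and
    # NER from being loaded or run.
    return stanza.Pipeline(lang=lang, processors="tokenize")


def token_texts(doc):
    """Collect token strings per sentence from a Stanza Document."""
    return [[token.text for token in sentence.tokens] for sentence in doc.sentences]


# Usage (requires stanza and the downloaded "en" models):
#   snlp = build_tokenizer_pipeline("en")
#   print(token_texts(snlp("Hello world. This is a test.")))
```

With only the tokenize processor loaded, the resulting document carries sentence and token boundaries but no tags, lemmas, or dependency parses, which is where the time savings come from.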