dcsan opened this issue 4 years ago
Not a bug report per se.

I'm wondering how spaCy's Chinese models compare with the Stanza project. Stanza already provides Chinese support with many features (https://stanfordnlp.github.io/stanza/models.html): it has a Chinese (simplified) model and provides a dependency parser, lemmatization, and other basic NLP features.
I'm a bit confused because Stanza itself can use spaCy for tokenization (https://stanfordnlp.github.io/stanza/tokenize.html#use-spacy-for-fast-tokenization-and-sentence-segmentation):

> You can only use spaCy to tokenize English text for now, since the spaCy tokenizer does not handle multi-word token expansion for other languages.

which would imply spaCy is a lower-level library, and yet the two seem similar.
Hi @dcsan, to me the reason Stanza uses spaCy for tokenization may simply be that spaCy's English tokenization is very good (and fast). I think Stanza and spaCy are both full-featured NLP frameworks.
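For what it's worth, spaCy does ship basic Chinese support in its core package: in recent spaCy versions (v3+), a blank `zh` pipeline segments text character-by-character by default, and word-level segmentation (jieba or pkuseg) is an opt-in configuration choice. A minimal sketch, assuming only spaCy itself is installed:

```python
# Sketch of spaCy's built-in Chinese support (assumes spaCy v3+).
# A blank "zh" pipeline uses the "char" segmenter by default, so each
# character becomes its own token; jieba/pkuseg segmentation is opt-in.
import spacy

nlp = spacy.blank("zh")        # blank Chinese pipeline, default segmenter: "char"
doc = nlp("我爱自然语言处理")    # "I love natural language processing"
tokens = [token.text for token in doc]
print(tokens)
```

With the default character segmenter this just splits into individual characters; for word-level tokens you would configure the segmenter (and for tagging/parsing you would load one of the pretrained `zh_core_web_*` models instead of a blank pipeline).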