explosion / spacy-stanza

💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
MIT License

Employ user hooks to map Stanford's pretrained word embeddings to spaCy's token vectors #7

Closed by buhrmann 5 years ago

buhrmann commented 5 years ago

Since StanfordNLP's pretrained word embeddings are already loaded into memory (I think), user hooks seem to be the easiest way to access them. Another option, I guess, would be to load them as external vectors into spaCy's vocabulary, but I'm not sure that would offer any advantage, given that it would also duplicate the vectors in memory.
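
For readers unfamiliar with the mechanism, here is a rough sketch of how user hooks can redirect `token.vector` to an external embedding table. It is not the code from this PR: the `EMBEDDINGS` dict is a hypothetical stand-in for the vectors StanfordNLP already holds in memory, and the helper names (`token_vector`, `token_has_vector`, `doc_vector`, `add_vector_hooks`) are made up for illustration. Only `Doc.user_token_hooks` and `Doc.user_hooks` are spaCy's own attributes.

```python
import numpy as np
import spacy

# Hypothetical stand-in for an embedding table that another library
# has already loaded into memory (toy 3-dimensional vectors).
DIM = 3
EMBEDDINGS = {
    "hello": np.array([1.0, 0.0, 0.0], dtype="float32"),
    "world": np.array([0.0, 1.0, 0.0], dtype="float32"),
}


def token_vector(token):
    """Look up the token's lowercased text; fall back to a zero vector."""
    return EMBEDDINGS.get(token.lower_, np.zeros(DIM, dtype="float32"))


def token_has_vector(token):
    """Report whether the external table contains the token."""
    return token.lower_ in EMBEDDINGS


def doc_vector(doc):
    """Average the hooked token vectors (zero vector for an empty doc)."""
    if len(doc) == 0:
        return np.zeros(DIM, dtype="float32")
    return np.mean([token_vector(t) for t in doc], axis=0)


def add_vector_hooks(doc):
    """Attach the hooks so spaCy's vector attributes use the external table."""
    doc.user_token_hooks["vector"] = token_vector
    doc.user_token_hooks["has_vector"] = token_has_vector
    doc.user_hooks["vector"] = doc_vector
    return doc


nlp = spacy.blank("en")  # tokenizer only, no vectors of its own
doc = add_vector_hooks(nlp("Hello world"))
print(doc[0].has_vector, doc[0].vector, doc.vector)
```

With hooks, the vectors stay wherever they already live and are only looked up on access. The alternative mentioned above would instead copy each vector into spaCy's own table, for example via `nlp.vocab.set_vector(word, vector)`, which is where the duplicated memory would come from.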

Note: this PR also updates the spaCy dependency from the 2.1.0 alpha (nightly) to the stable 2.1.0 release.

ines commented 5 years ago

Oh wow, thanks, this is great! 👍 Once this is merged, I'll push another release – can't believe I missed the spaCy dependency update.

buhrmann commented 5 years ago

I think this should be sufficiently robust now?

ines commented 5 years ago

Looks great, thanks!