allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.
https://allenai.github.io/scispacy/
Apache License 2.0
1.66k stars, 223 forks

en_core_sci_lg: is the dimensionality of the word vectors 300? #491

Closed. barebra closed this issue 11 months ago.

barebra commented 11 months ago

I'm asking because I hit a ValueError when pretraining the spaCy tok2vec layer of the textcat component with the PretrainVectors objective and its default values (see https://spacy.io/usage/embeddings-transformers#pretraining-objectives).

barebra commented 11 months ago

Okay, it's 200, not 300 as in en_core_web_lg.
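
For anyone who wants to check this themselves, here is a minimal sketch of reading the vector width from a loaded pipeline. It assumes spaCy and both model packages are installed; `vector_width` and `report` are hypothetical helper names, not part of spaCy or scispacy.

```python
def vector_width(nlp):
    """Return the width of the static word vectors of a loaded pipeline."""
    # Vectors.shape is a (number of rows, vector width) tuple.
    _n_rows, width = nlp.vocab.vectors.shape
    return width


def report(model_names):
    """Print the vector width of each named pipeline (hypothetical helper)."""
    # Imported here so the helper above stays usable without spaCy installed.
    import spacy

    for name in model_names:
        nlp = spacy.load(name)  # raises OSError if the package is missing
        print(name, vector_width(nlp))
```

Calling `report(["en_core_sci_lg", "en_core_web_lg"])` prints one width per model, which makes it easy to compare the value against whatever the pretraining config expects.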

barebra commented 11 months ago

That's it.