Closed — KIRNESH closed this issue 4 years ago
Hi, do you have any updates on this?
Hi @KIRNESH, apologies for the late follow-up. In general it's probably better to ask more generic questions on StackOverflow, where there is a larger community. It also helps us to keep this tracker focused on bug reports and feature requests.
In general, your process looks fine. The spacy pretrain command will basically learn from the vectors you provided (from en_core_web_md) and use the result as the internal Tok2Vec layer. You need to start from a blank model for this (as you do in the code snippet), because you can't just swap out the underlying Tok2Vec layer and expect the other parts of the pretrained components to still work correctly.
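For reference, the two-step workflow described above looks roughly like this with the spaCy v2.x CLI. The paths and file names here (raw_texts.jsonl, ./pretrained, model999.bin, etc.) are placeholders, not taken from the original thread:

```shell
# Step 1: pretrain the Tok2Vec layer on raw text, using the word
# vectors from en_core_web_md as the prediction target.
python -m spacy pretrain raw_texts.jsonl en_core_web_md ./pretrained

# Step 2: train e.g. an NER pipeline from scratch, initialising its
# Tok2Vec layer from the pretrained weights (spacy pretrain writes one
# modelN.bin file per epoch; pick the one you want to start from).
python -m spacy train en ./output train.json dev.json \
    --pipeline ner --init-tok2vec ./pretrained/model999.bin
```

The --init-tok2vec flag is what wires the pretrained weights into the new blank model, which is why the training run has to start from scratch rather than from an existing pretrained pipeline.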
In general it's virtually impossible to say what the loss should look like — it really depends on the size of your dataset and the hyperparameters you're using. You want to monitor the loss across training iterations: if it stops decreasing significantly, the training process has hit its limits. You can definitely experiment with different hyperparameters for your model and training loops, but ideally you'd evaluate their effect on a downstream task, e.g. some NER challenge where you measure accuracy on a held-out test set. That will give you a realistic idea of how much the pretraining helps (or doesn't!). See also this blog post for more background information, and this user blog post for an example.
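As a rough illustration of "stop when the loss stops decreasing significantly", a plateau check like the following is often enough. This is a generic sketch in plain Python, not spaCy API; the window size and threshold are arbitrary example values:

```python
def has_plateaued(losses, window=3, min_rel_improvement=0.01):
    """Return True if the mean loss over the last `window` iterations
    improved by less than `min_rel_improvement` (relative) compared to
    the `window` iterations before it."""
    if len(losses) < 2 * window:
        return False  # not enough history to judge yet
    prev = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    return (prev - recent) / prev < min_rel_improvement

# Example: a loss curve that has flattened out
curve = [10.0, 6.0, 4.0, 2.96, 2.95, 2.94, 2.94, 2.94, 2.94]
print(has_plateaued(curve))  # True: the recent window barely improves
```

You would call this once per training iteration with the running list of losses and break out of the loop when it returns True — though, as noted above, a downstream evaluation is a better final yardstick than the pretraining loss itself.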
Hope that helps!
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
I am training an NER system on a domain-specific dataset. Since the data is poorly and only sparsely annotated, I tried using the "spacy pretrain" CLI command to generate vectors.
But I want to use the result during language model training, i.e. with nlp.update(). I followed the code described in this GitHub issue: https://github.com/explosion/spaCy/issues/3448. The code looks like this:
My questions are: