Open maciejbiesek opened 4 years ago
You can train the existing models further on your own data using the spaCy CLI train
command. Just provide the name of the model, or a link to it, as the argument of the --base-model
parameter. You will first need to convert your data to spaCy's JSON format using the convert
command.
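A rough sketch of what those two steps could look like, assuming the spaCy v2 command line (python -m spacy convert / python -m spacy train with positional arguments; newer spaCy versions use a different, config-based interface). The model name, file names and directories below are hypothetical placeholders, and the helper only builds the commands without running them:

```python
import shlex
import sys


def build_commands(base_model, raw_file, json_dir, train_json, dev_json,
                   output_dir, lang="pl", pipeline="ner"):
    """Build (convert_cmd, train_cmd) argument lists for the spaCy v2 CLI.

    Nothing is executed here; pass the lists to subprocess.run() yourself
    once the paths point at real data.
    """
    python = sys.executable
    # Step 1: convert the annotated corpus to spaCy's JSON training format.
    convert_cmd = [python, "-m", "spacy", "convert", raw_file, json_dir]
    # Step 2: continue training from the existing model, updating only the
    # components listed in --pipeline (here: the NER component).
    train_cmd = [
        python, "-m", "spacy", "train",
        lang, output_dir, train_json, dev_json,
        "--base-model", base_model,
        "--pipeline", pipeline,
    ]
    return convert_cmd, train_cmd


if __name__ == "__main__":
    # All names below are placeholders, not real model or file names.
    commands = build_commands(
        base_model="my_base_model",
        raw_file="my_corpus.conllu",
        json_dir="corpus_json",
        train_json="corpus_json/train.json",
        dev_json="corpus_json/dev.json",
        output_dir="tuned_model",
    )
    for cmd in commands:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Set pipeline to "tagger" (or "tagger,ner") to tune the tagger of the basic version in the same way.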
This should work, with the exception of the POS tagger in the morfeusz-based version, which is not a spaCy component.
So, to sum up, we can tune e.g. the NER model and the POS tagger in the basic version, but we cannot adapt the morfeusz-based models to our specific data?
You can tune NER in the morfeusz version, but you cannot do so for its POS tagger.
If it is the morfeusz tokenization that you're after, I suppose you could retrain the basic tagger, and then use it as a component in the pipeline.
Adding the ability to retrain the morfeusz-version tagger would require more work, but we will consider this.
Ok, I see, thank you :)
Is there any option to tune the models (NER, POS) you provided on our own corpora?