Closed vtangutoori closed 6 years ago
It looks like you're using an outdated version of spaCy (1.0). Try upgrading to the most recent (2.0.5) and see if that clears it up. You'll also need to make sure that your data is in the format that spaCy expects.
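As a rough illustration of the expected format: the spaCy 2.x `train_ner.py` example uses a list of `(text, {'entities': [(start, end, label)]})` tuples, where `start` and `end` are character offsets into the text. The sketch below shows that layout with a small sanity check. `check_offsets` is a hypothetical helper written for this example and is not part of spaCy.

```python
# Layout used by the spaCy 2.x example script:
# (text, {'entities': [(start_char, end_char, label)]})
TRAIN_DATA = [
    ('Who is Shaka Khan?', {'entities': [(7, 17, 'PERSON')]}),
    ('I like London and Berlin.', {'entities': [(7, 13, 'LOC'), (18, 24, 'LOC')]}),
]


def check_offsets(train_data):
    """Hypothetical helper: return (text, entity) pairs whose offsets look wrong.

    An entity is flagged if its offsets fall outside the text, are empty or
    reversed, or if the selected span starts or ends with whitespace.
    """
    problems = []
    for text, annotations in train_data:
        for start, end, label in annotations.get('entities', []):
            span = text[start:end]
            in_range = 0 <= start < end <= len(text)
            if not in_range or span != span.strip():
                problems.append((text, (start, end, label)))
    return problems


# An empty list means every entity span passed the basic checks.
print(check_offsets(TRAIN_DATA))
```

Running a check like this over your own data before training can catch off-by-one offsets early, which is cheaper than discovering them after hours of training.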
Hi @ahalterman, thank you for the response. I updated the spaCy version and ran the training, but after 8 hours it still had not finished, so I shut the kernel down. Do you know of any resources with examples of how to tune the training process, so that I can get a better understanding? I am a newbie to the ML and AI fields.
Hey @vtangutoori, the training process is fine. What goes wrong is the testing part, where you are testing against the whole training data.
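One possible way to keep the testing step short is to run it on a small random sample instead of every training example. This is only a sketch: `sample_for_testing` is a hypothetical helper, and the sample size of 20 is an arbitrary assumption.

```python
import random


def sample_for_testing(train_data, k=20, seed=0):
    """Hypothetical helper: pick at most k examples to run the test loop on."""
    rng = random.Random(seed)      # fixed seed so the same sample is reused
    k = min(k, len(train_data))    # never ask for more items than exist
    return rng.sample(train_data, k)


# In the script, the test loop would then iterate over the sample, e.g.
#   for text, _ in sample_for_testing(TRAIN_DATA):
#       doc = nlp(text)
```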
It's true that it's not that fast to train if you have a lot of data. To better visualize what's going on, you can change this line from
for text, annotations in TRAIN_DATA:
to
for text, annotations in tqdm(TRAIN_DATA):
to get a progress bar for each iteration. The first iteration is the slowest because it uses batch size 1, but at least you'll be able to see whether it's moving along and how long it may take. Don't forget to add
from tqdm import tqdm
at the top of train_ner.py
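Put together, the change might look like the sketch below. The `update` callback stands in for the `nlp.update(...)` call in the script, and `run_epoch` is a hypothetical wrapper used only to keep the example self-contained. The `ImportError` fallback is also an addition of this sketch, so that it still runs where tqdm is not installed.

```python
try:
    from tqdm import tqdm
except ImportError:
    # Fallback (assumption of this sketch): no progress bar, same iteration.
    def tqdm(iterable, **kwargs):
        return iterable


def run_epoch(train_data, update):
    """Call update(text, annotations) for each example, showing progress."""
    count = 0
    for text, annotations in tqdm(train_data, desc="training examples"):
        update(text, annotations)  # in train_ner.py this is the nlp.update call
        count += 1
    return count


# Tiny stand-in data and a no-op update, just to show the loop running.
examples = [("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]})]
run_epoch(examples, lambda text, annotations: None)
```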
To add to @ahalterman's comment: The training examples in the examples directory are mostly intended to be self-contained scripts that you can run and test quickly. They're not really optimised to work with large datasets; for example, they don't use batching.
So once you're getting "serious" about training your model, you might want to use the built-in spacy train command instead (see the spaCy documentation for details). You can find the full implementation of spacy train here: https://github.com/explosion/spaCy/blob/master/spacy/cli/train.py
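To illustrate what batching means here, a minimal sketch in plain Python follows. `minibatch` and `train_in_batches` are hypothetical helpers written for this example, the batch size of 8 is an arbitrary assumption, and `update` stands in for a call such as `nlp.update(texts, annotations, drop=0.5, sgd=optimizer, losses=losses)`. This is not how `spacy train` is implemented; see the linked source for that.

```python
def minibatch(items, size):
    """Hypothetical helper: yield consecutive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def train_in_batches(train_data, update, batch_size=8):
    """Call update(texts, annotations) once per batch instead of per example.

    Returns the number of batches processed.
    """
    n_batches = 0
    for batch in minibatch(train_data, batch_size):
        # Split [(text, annotations), ...] into two parallel lists.
        texts, annotations = zip(*batch)
        update(list(texts), list(annotations))
        n_batches += 1
    return n_batches
```

With 1,000 examples and a batch size of 8, this makes 125 update calls per iteration instead of 1,000.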
Hi, I am trying to train spaCy's named entity recognizer using a training set of my own. I used the code provided on the official website. I am new to Python and machine learning; can someone please tell me where I am going wrong?
I took the code below from spaCy's GitHub page, replaced the training data, and called the function as shown below.
function call
main(model='en', output_dir=None, n_iter=5)
spaCy code (https://github.com/explosion/spacy/blob/master/examples/training/train_ner.py)
"""Example of training spaCy's named entity recognizer, starting off with an existing model or a blank model. For more details, see the documentation:
import plac
import random
from pathlib import Path
import spacy
training data
TRAIN_DATA =name_set
@plac.annotations(
    model=("en", "option", "m", str),
    output_dir=("C:/Python27/Python-Data-Science-and-Machine-Learning-Bootcamp/Machine Learning Sections/my work", "option", "o", Path),
    n_iter=(5, "option", "n", int))
def main(model=None, output_dir=None, n_iter=100):
    """Load the model, set up the pipeline and train the entity recognizer."""
    if model is not None:
        nlp = spacy.load(model)  # load existing spaCy model
        print("Loaded model '%s'" % model)
    else:
        nlp = spacy.blank('en')  # create blank Language class
        print("Created blank 'en' model")
if __name__ == '__main__':
    plac.call(main)
Your Environment
I am using Python 3.x and the latest version of spaCy.