explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

How to train a custom NER model for multiple entity types on the same training data in a single script? #2440

Closed prashant334 closed 6 years ago

prashant334 commented 6 years ago

@honnibal @ines

Info about spaCy

Python version     2.7.12         
Platform           Linux-4.13.0-43-generic-i686-with-Ubuntu-16.04-xenial
spaCy version      2.0.11         
Location           /usr/local/lib/python2.7/dist-packages/spacy
Models             en_core_web_sm 

Hi, I have trained a custom NER model for crime location extraction with the help of https://github.com/explosion/spacy/blob/master/examples/training/train_new_entity_type.py. However, that setup trains only a single entity type, the crime location.

For example, suppose we have crime articles for training and want to extract the crime location, the crime type, the name of the victim and the weapon used. How can we train all of these entity types from the same training data in a single script, rather than training a separate model for each custom entity type?

ines commented 6 years ago

When you train the model, you can always provide more than one label, and also use several labels in the training data. First, you can add the list of labels to the NER model:

labels = ['CRIME_LOC', 'CRIME_TYPE', 'VICTIM', 'WEAPON']
for label in labels:
    ner.add_label(label)
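
The snippet above assumes a `ner` object already exists. Here is a minimal sketch of one way to obtain it, assuming the spaCy 2.x pipeline API (`pipe_names`, `get_pipe`, `create_pipe`, `add_pipe`) that matches the 2.0.11 version reported in this thread. The helper name `ner_with_labels` is made up for illustration:

```python
def ner_with_labels(nlp, labels):
    """Return the pipeline's NER component with every label registered.

    Assumption: `nlp` is a spaCy 2.x Language object, for example the
    result of spacy.load('en_core_web_sm') or spacy.blank('en').
    """
    if 'ner' in nlp.pipe_names:
        # A loaded model usually ships with a recognizer: reuse it.
        ner = nlp.get_pipe('ner')
    else:
        # A blank pipeline has no recognizer yet: create and append one.
        ner = nlp.create_pipe('ner')
        nlp.add_pipe(ner, last=True)
    # Register all custom labels on the same component.
    for label in labels:
        ner.add_label(label)
    return ner
```

Called as `ner = ner_with_labels(nlp, labels)`, this creates the component for a blank pipeline and extends the existing recognizer for a loaded model.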

And your training data could then look like this:

TRAIN_DATA = [
    (
        "John Doe was shot with a rifle in East London", 
        {'entities': [(0, 8, 'VICTIM'), (25, 30, 'WEAPON'), (34, 45, 'CRIME_LOC')]}
    )
]
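
Hand-counted character offsets are easy to get wrong, so it can help to print the text each span actually covers before training. The sketch below does that and then runs one update loop over all labels together. It assumes the spaCy 2.x API (`nlp.disable_pipes`, and `nlp.update` with `sgd`, `drop` and `losses`). `spans` and `train_ner` are hypothetical helper names, and the `n_iter` and `drop` defaults are arbitrary starting points, not tuned values.

```python
import random

TRAIN_DATA = [
    (
        "John Doe was shot with a rifle in East London",
        {'entities': [(0, 8, 'VICTIM'), (25, 30, 'WEAPON'), (34, 45, 'CRIME_LOC')]}
    )
]


def spans(text, annotations):
    """Return (substring, label) pairs so the offsets can be checked by eye."""
    return [(text[start:end], label)
            for start, end, label in annotations['entities']]


def train_ner(nlp, train_data, optimizer, n_iter=20, drop=0.35):
    """Update only the NER component, on all labels in the same pass.

    Assumption: `nlp` is a spaCy 2.x Language object whose 'ner' pipe
    already has every label registered via add_label().
    """
    other_pipes = [name for name in nlp.pipe_names if name != 'ner']
    data = list(train_data)  # copy, so the caller's list is not shuffled
    history = []
    with nlp.disable_pipes(*other_pipes):  # leave tagger, parser etc. untouched
        for _ in range(n_iter):
            random.shuffle(data)
            losses = {}
            for text, annotations in data:
                nlp.update([text], [annotations], sgd=optimizer,
                           drop=drop, losses=losses)
            history.append(losses)
    return history


print(spans(*TRAIN_DATA[0]))
# [('John Doe', 'VICTIM'), ('rifle', 'WEAPON'), ('East London', 'CRIME_LOC')]
```

For a blank pipeline the optimizer would typically come from `nlp.begin_training()`. That call initializes the model weights, which may discard what a pretrained recognizer has already learned, so the optimizer is left as a parameter here rather than created inside the function.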