explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

Are character-level features from a charCNN used for NER in spaCy? #1900

Closed prashant334 closed 6 years ago

prashant334 commented 6 years ago

Python version: 2.7.6
Platform: Linux-3.16.0-77-generic-x86_64-with-Ubuntu-14.04-trusty
spaCy version: 2.0.0a17
Models: en, en_core_web_sm, xx_ent_wiki_sm

During model training, is a charCNN used to capture morphological features from characters?

import random

# minibatch and get_gold_parses are assumed to be imported or defined
# elsewhere in the script; they are not shown in this snippet.

def train_ner(nlp, train_data, output_dir):
    random.seed(0)
    optimizer = nlp.begin_training(lambda: [])
    nlp.meta['name'] = 'CRIME_LOCATION'
    for itn in range(50):
        losses = {}
        # Update the model on small batches of (doc, gold) pairs.
        for batch in minibatch(get_gold_parses(nlp.make_doc, train_data), size=3):
            docs, golds = zip(*batch)
            nlp.update(docs, golds, losses=losses, sgd=optimizer, drop=0.35)
        print("under learning")
    if not output_dir:
        return

honnibal commented 6 years ago

Yes, spaCy's NER (and other models) uses subword features, although it doesn't use a character-based CNN to extract them. Instead, the word vectors are learned by concatenating embeddings of NORM, PREFIX, SUFFIX and SHAPE lexical attributes. A hidden layer is then used to allow a non-linear combination of the information in these concatenated vectors. The function for this can be found in spacy._ml.Tok2Vec.
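For readers who want a concrete picture of the strategy described above, here is a rough, dependency-free sketch. It is not spaCy's code: the names `word_shape`, `lexical_attrs`, `build_model` and `tok2vec` are made up for illustration, the hashed table lookup, the one-character prefix and three-character suffix are assumptions, and a plain ReLU layer stands in for whatever non-linearity the real hidden layer uses.

```python
import random
import zlib

ATTRS = ("NORM", "PREFIX", "SUFFIX", "SHAPE")

def word_shape(text):
    # Approximate shape feature (an assumption, not spaCy's exact rule):
    # letters become X/x, digits become d, and runs are capped at 4.
    shape, last, run = [], "", 0
    for ch in text:
        if ch.isalpha():
            s = "X" if ch.isupper() else "x"
        elif ch.isdigit():
            s = "d"
        else:
            s = ch
        run = run + 1 if s == last else 1
        last = s
        if run <= 4:
            shape.append(s)
    return "".join(shape)

def lexical_attrs(text):
    # Prefix and suffix lengths here are assumed values for illustration.
    return {
        "NORM": text.lower(),
        "PREFIX": text[:1],
        "SUFFIX": text[-3:],
        "SHAPE": word_shape(text),
    }

def embed(table, key):
    # Hash the attribute string to a row of the table (assumed lookup scheme).
    row = zlib.crc32(key.encode("utf8")) % len(table)
    return table[row]

def build_model(width=4, n_rows=50, hidden=6, seed=0):
    # One embedding table per attribute, plus weights for one hidden layer.
    rng = random.Random(seed)
    tables = {
        a: [[rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(n_rows)]
        for a in ATTRS
    }
    weights = [
        [rng.uniform(-0.1, 0.1) for _ in range(width * len(ATTRS))]
        for _ in range(hidden)
    ]
    biases = [0.0] * hidden
    return tables, weights, biases

def tok2vec(text, model):
    tables, weights, biases = model
    attrs = lexical_attrs(text)
    # Concatenate the four attribute embeddings into one vector.
    concat = []
    for a in ATTRS:
        concat.extend(embed(tables[a], attrs[a]))
    # Hidden layer: a non-linear mix of the concatenated features (ReLU here).
    return [
        max(0.0, sum(w * x for w, x in zip(row, concat)) + b)
        for row, b in zip(weights, biases)
    ]

model = build_model()
vec_paris = tok2vec("Paris", model)
vec_parks = tok2vec("Parks", model)
```

In this sketch "Paris" and "Parks" look up the same PREFIX and SHAPE rows and differ only in NORM and SUFFIX, which is how subword information ends up shared between words without any character-level convolution.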

The best reference for this embedding strategy is currently the NER algorithm video: https://www.youtube.com/watch?v=sqDHBH9IjRU

ines commented 6 years ago

To add to @honnibal's comment above, there's also a section in the API docs that describes the neural network model architecture in more detail: https://spacy.io/api/#nn-model
