explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

NER component in en_core_web_trf doesn't depend on transformer #13325

Closed frankier closed 8 months ago

frankier commented 8 months ago

How to reproduce the behaviour

I only wanted entities, so I enabled just the NER component, hoping it would run a bit faster.

import spacy
nlp = spacy.load("en_core_web_trf", enable=["ner"])
results = nlp("I went to France for a coffee with Francois")
for ent in results.ents:
    print(ent.text, ent.label_)

In the output, every consecutive bigram is simply labelled ORDINAL:

I went ORDINAL
to France ORDINAL
for a ORDINAL
coffee with ORDINAL

The problem goes away when I enable transformer:

import spacy
nlp = spacy.load("en_core_web_trf", enable=["ner", "transformer"])
results = nlp("I went to France for a coffee with Francois")
for ent in results.ents:
    print(ent.text, ent.label_)

Output:

France GPE
Francois PERSON

I suppose ner should depend upon transformer.
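
One way to spot this kind of hidden dependency before running the pipeline is to look through the component configs for listener-style architectures. The sketch below is only a heuristic under stated assumptions: `find_listener_components` is a hypothetical helper (not part of spaCy), it assumes that a component relying on a shared embedding layer has an "@architectures" value containing the substring "Listener" somewhere in its config, and the `example_components` dict is a hand-written stand-in whose architecture names are illustrative, not copied from en_core_web_trf.

```python
from typing import Any, Dict, List, Mapping


def find_listener_components(components: Mapping[str, Any]) -> Dict[str, List[str]]:
    """Map each component name to the listener-style architectures in its config.

    Hypothetical helper: it assumes a listener can be recognised by the
    substring "Listener" in an "@architectures" value.
    """

    def collect(node: Any) -> List[str]:
        # Walk the nested config and gather matching architecture names.
        found: List[str] = []
        if isinstance(node, Mapping):
            arch = node.get("@architectures")
            if isinstance(arch, str) and "Listener" in arch:
                found.append(arch)
            for value in node.values():
                found.extend(collect(value))
        return found

    result: Dict[str, List[str]] = {}
    for name, cfg in components.items():
        hits = collect(cfg)
        if hits:
            result[name] = hits
    return result


# Hand-written stand-in for a pipeline's component configs.
# The architecture names are illustrative assumptions, not real registry entries.
example_components = {
    "transformer": {
        "factory": "transformer",
        "model": {"@architectures": "example.TransformerModel.v1"},
    },
    "ner": {
        "factory": "ner",
        "model": {
            "@architectures": "example.TransitionBasedParser.v1",
            "tok2vec": {"@architectures": "example.TransformerListener.v1"},
        },
    },
}

listeners = find_listener_components(example_components)
print(listeners)
```

With spaCy installed, the same function could presumably be called on nlp.config["components"] after spacy.load(...). Any component it reports would then be a candidate for needing its upstream embedding component (here, transformer) enabled as well.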

Your Environment

svlandeg commented 8 months ago

Hi! Let me transfer this thread to the discussion forum and follow up there.