Open rennanvoa2 opened 3 years ago
Hi,
Since I'm new to spaCy and Python, I'm not sure if this is the correct way to implement it. With Python 3.9 and spaCy 3.0.3, the following worked for me:
import spacy
from spacy.language import Language
from spacy_langdetect import LanguageDetector
# Add LanguageDetector and assign it a string name
@Language.factory("language_detector")
def create_language_detector(nlp, name):
    return LanguageDetector(language_detection_function=None)
# Use a blank Pipeline, also a model can be used, e.g. nlp = spacy.load("en_core_web_sm")
nlp = spacy.blank("en")
# Add sentencizer for longer text
nlp.add_pipe('sentencizer')
# Add components using their string names
nlp.add_pipe("language_detector")
# Analyze components and their attributes
text = "This is an English text."
doc = nlp(text)
# Document level language detection.
print(doc._.language)
# See what happened to the pipes
nlp.analyze_pipes(pretty=True)
I got on this track with: Language-specific pipeline
Is this the right way to use it with spaCy 3?
How do I use the result for language-specific processing?
Do I have to load language-specific models, e.g. nlp_en = spacy.load("en_core_web_sm") and nlp_de = spacy.load("de_core_news_sm")?
Many thanks and best regards,
Cusard
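On the question of language-specific processing, one possible approach is to map the detected language code to a model name and load each model only once. This is a minimal sketch, not something spacy-langdetect provides: MODEL_BY_LANGUAGE, choose_model, load_pipeline, the min_score threshold, and the fallback language are all hypothetical names and values chosen for illustration. It only assumes that doc._.language is a dict with 'language' and 'score' keys.

```python
from functools import lru_cache

# Hypothetical mapping from a detected language code to a spaCy model name.
# The two models are the ones mentioned in the question above.
MODEL_BY_LANGUAGE = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
}


def choose_model(detection, min_score=0.8, fallback="en"):
    """Pick a model name from a detection dict like {'language': 'en', 'score': 0.99}.

    Falls back to the model for `fallback` when the language is not mapped
    or the score is below `min_score` (an arbitrary threshold, tune as needed).
    """
    language = detection.get("language")
    score = detection.get("score", 0.0)
    if language in MODEL_BY_LANGUAGE and score >= min_score:
        return MODEL_BY_LANGUAGE[language]
    return MODEL_BY_LANGUAGE[fallback]


@lru_cache(maxsize=None)
def load_pipeline(model_name):
    """Load a spaCy model once and reuse it on later calls."""
    import spacy  # imported lazily so the routing logic can be tried without spaCy

    return spacy.load(model_name)


# Routing only, no models needed:
print(choose_model({"language": "de", "score": 0.97}))  # de_core_news_sm
print(choose_model({"language": "fr", "score": 0.99}))  # en_core_web_sm (fallback)

# With spaCy and the models installed, usage could look like this:
#   detection = nlp(text)._.language
#   doc = load_pipeline(choose_model(detection))(text)
```

So yes, under this approach the language-specific models still have to be installed and loaded separately; the detector only tells you which one to pick.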
same problem
Hello everybody! Thanks to @Cusard, I got the example code to work with the current spaCy version.
import spacy
from spacy.language import Language
from spacy_langdetect import LanguageDetector
@Language.factory("language_detector")
def create_language_detector(nlp, name):
    return LanguageDetector(language_detection_function=None)
nlp = spacy.load("en_core_web_sm")
nlp.add_pipe('language_detector')
text = 'This is an english text.'
doc = nlp(text)
# document level language detection. Think of it like average language of the document!
print(doc._.language)
# sentence level language detection
for sent in doc.sents:
    print(sent, sent._.language)
The output looks like this:
{'language': 'en', 'score': 0.9999983570159962}
This is an english text. {'language': 'en', 'score': 0.9999956329695125}
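If you want to combine the sentence-level results yourself, for example on mixed-language text, here is a rough sketch. It sums the scores per language and returns the language with the highest total. This is not how the library computes doc._.language; dominant_language is a hypothetical helper, and the sample values below are made up in the same dict shape as the output above.

```python
from collections import defaultdict


def dominant_language(detections):
    """Return the language code with the highest summed score, or None if empty."""
    totals = defaultdict(float)
    for detection in detections:
        totals[detection["language"]] += detection["score"]
    if not totals:
        return None
    return max(totals, key=totals.get)


# Made-up sentence-level results; with spaCy these could come from
#   [sent._.language for sent in doc.sents]
sentence_results = [
    {"language": "en", "score": 0.99},
    {"language": "de", "score": 0.95},
    {"language": "en", "score": 0.85},
]

print(dominant_language(sentence_results))  # en
```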
Thanks for sharing the solution. It worked for me too.
It would be nice if the example on the project home page were updated: https://spacy.io/universe/project/spacy-langdetect
The example provided by @FelixSiegfriedRiedel works for me with v3.3.
I've also raised an issue about updating the documentation: https://github.com/explosion/spaCy/issues/11038
Hello guys, with the v3 update, the example code complains when I run it:
I figured out that we now have to pass the string name to nlp.add_pipe, but how? I've tried nlp.add_pipe("langdetect"), nlp.add_pipe("LanguageDetector"), and nlp.add_pipe("languagedetector"), and none of them seems to work. Can you help me with this?