aboSamoor / polyglot

Multilingual text (NLP) processing toolkit
http://polyglot-nlp.com

polyglot.text.BaseBlob.language should accept a Language object #72

Open alexgarel opened 8 years ago

alexgarel commented 8 years ago
>>> import polyglot
>>> from polyglot.text import Sentence
>>> test = Sentence("This is a test")
>>> test.language = "en"
>>> another = Sentence("This is another test")
>>> another.language = test.language
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/polyglot/text.py", line 59, in language
    self.__lang = Language.from_code(value)
  File "/usr/local/lib/python3.5/dist-packages/polyglot/detect/base.py", line 47, in from_code
    return Language(("", code, 100, 0))
  File "/usr/local/lib/python3.5/dist-packages/polyglot/detect/base.py", line 28, in __init__
    self.locale = Locale(code)
icu.InvalidArgsError: (<class 'icu.Locale'>, '__init__', (<polyglot.detect.base.Language object at 0x7f35a67488d0>,))

While setting the language from a code is a good idea, not accepting a Language object is counterintuitive.

This should be trivial to implement using isinstance(value, Language).
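
A minimal sketch of what that check could look like. The two classes below are simplified stand-ins written for illustration, not polyglot's actual code: the real Language wraps an icu.Locale and the real setter lives in polyglot/text.py, so the attribute names here (`code`, `_lang`) are assumptions.

```python
class Language:
    """Simplified stand-in for polyglot.detect.base.Language (hypothetical)."""

    def __init__(self, code):
        # Like icu.Locale in the traceback, reject anything that is not a code string.
        if not isinstance(code, str):
            raise TypeError("language code must be a string")
        self.code = code

    @classmethod
    def from_code(cls, code):
        return cls(code)


class BaseBlob:
    """Simplified stand-in for polyglot.text.BaseBlob (hypothetical)."""

    def __init__(self, text):
        self.raw = text
        self._lang = None

    @property
    def language(self):
        return self._lang

    @language.setter
    def language(self, value):
        if isinstance(value, Language):
            # Already a Language object: store it as-is.
            self._lang = value
        else:
            # Otherwise treat the value as a language code, as before.
            self._lang = Language.from_code(value)


test = BaseBlob("This is a test")
test.language = "en"
another = BaseBlob("This is another test")
another.language = test.language  # no longer raises
print(another.language.code)
```

With this change, assigning a code string keeps working unchanged, and assigning an existing Language object reuses that object instead of passing it to from_code.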

Running polyglot 16.07.04 with Python 3.5.1 on Ubuntu 16.04.