cltk / cltkv1

Experimental repo for new API CLTK

Fix stanza download helper #69

Closed kylepjohnson closed 4 years ago

kylepjohnson commented 4 years ago

This happens when the models have not been downloaded.


In [1]: from cltkv1 import NLP                                                  

In [2]: from cltkv1.languages.example_texts import get_example_text             

In [3]: cltk_nlp = NLP(language="lat")                                          

In [4]: cltk_doc = cltk_nlp.analyze(text=get_example_text("lat"))               

ΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑΑ

CLTK message: The part of the CLTK that you are using depends upon the Stanza NLP library (`stanza`). What follows are several question prompts coming from it. (More at: <https://github.com/stanfordnlp/stanza>.) Answer with defaults.

ΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩΩ

Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/master/resources_1.0.0.json: 115kB [00:00, 4.32MB/s]
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-4-6d0ff115552e> in <module>
----> 1 cltk_doc = cltk_nlp.analyze(text=get_example_text("lat"))

~/cltkv1/src/cltkv1/nlp.py in analyze(self, text)
    112         for process in self.pipeline.processes:
    113             a_process = process(input_doc=doc, language=self.language.iso_639_3_code)
--> 114             a_process.run()
    115             doc = a_process.output_doc
    116 

~/cltkv1/src/cltkv1/dependency/processes.py in run(self)
     42     def run(self):
     43         tmp_doc = self.input_doc
---> 44         stanza_wrapper = self.algorithm
     45         stanza_doc = stanza_wrapper.parse(tmp_doc.raw)
     46         cltk_words = self.stanza_to_cltk_word_type(stanza_doc)

~/.pyenv/versions/3.7.7/envs/cltkv137/lib/python3.7/site-packages/boltons/cacheutils.py in __get__(self, obj, objtype)
    608         if obj is None:
    609             return self
--> 610         value = obj.__dict__[self.func.__name__] = self.func(obj)
    611         return value
    612 

~/cltkv1/src/cltkv1/dependency/processes.py in algorithm(self)
     38     @cachedproperty
     39     def algorithm(self):
---> 40         return StanzaWrapper.get_nlp(language=self.language)
     41 
     42     def run(self):

~/cltkv1/src/cltkv1/dependency/stanza.py in get_nlp(cls, language, treebank)
    353             return cls.nlps[language]
    354         else:
--> 355             nlp = cls(language, treebank)
    356             cls.nlps[language] = nlp
    357             return nlp

~/cltkv1/src/cltkv1/dependency/stanza.py in __init__(self, language, treebank, stanza_debug_level)
    124         if not self._is_model_present():
    125             # download model if necessary
--> 126             self._download_model()
    127 
    128         # instantiate actual stanza class

~/cltkv1/src/cltkv1/dependency/stanza.py in _download_model(self)
    276         print("")  # pragma: no cover
    277         print("")  # pragma: no cover
--> 278         stanza.download(lang=self.language, package=self.treebank)
    279         # if file model still not available after attempted DL, then raise error
    280         if not file_exists(self.model_path):

~/.pyenv/versions/3.7.7/envs/cltkv137/lib/python3.7/site-packages/stanza/utils/resources.py in download(lang, dir, package, processors, logging_level, verbose)
    224     resources = json.load(open(os.path.join(dir, 'resources.json')))
    225     if lang not in resources:
--> 226         raise Exception(f'Unsupported language: {lang}.')
    227     if 'alias' in resources[lang]:
    228         logger.info(f'"{lang}" is an alias for "{resources[lang]["alias"]}"')

Exception: Unsupported language: lat.
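
Reading the traceback, `stanza.download()` receives `lang=self.language`, which here is the ISO 639-3 code `lat`, and Stanza rejects it. Stanza lists Latin under the code `la`, so one possible fix is to translate the code before calling the download helper. The following is only a minimal sketch under that assumption: the names `ISO_639_3_TO_STANZA` and `to_stanza_code` are hypothetical and not part of cltkv1, and the table covers only a few languages for illustration.

```python
# Hypothetical lookup table: CLTK's ISO 639-3 codes -> codes used by Stanza.
# Only a handful of languages are listed; this is not the full set.
ISO_639_3_TO_STANZA = {
    "lat": "la",   # Latin
    "grc": "grc",  # Ancient Greek (same code in both schemes)
    "chu": "cu",   # Old Church Slavonic
    "fro": "fro",  # Old French
    "got": "got",  # Gothic
}


def to_stanza_code(iso_639_3_code: str) -> str:
    """Return the Stanza language code for an ISO 639-3 code.

    Raises ValueError for codes missing from the table, so the failure
    happens before any download is attempted.
    """
    try:
        return ISO_639_3_TO_STANZA[iso_639_3_code]
    except KeyError:
        raise ValueError(
            f"No Stanza language code known for '{iso_639_3_code}'."
        ) from None


if __name__ == "__main__":
    print(to_stanza_code("lat"))
```

With something like this in place, `_download_model()` could pass `to_stanza_code(self.language)` instead of `self.language` as the `lang` argument of `stanza.download()`.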