UKPLab / EasyNMT

Easy to use, state-of-the-art Neural Machine Translation for 100+ languages
Apache License 2.0

Exception: 404 Client Error: Not Found for url: https://huggingface.co/api/models/Helsinki-NLP/opus-mt-ro-en #48

Open yen-tran-yum opened 3 years ago

yen-tran-yum commented 3 years ago

Hi,

I'm using EasyNMT for translating customer reviews. During translation, I got this error:

`HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models/Helsinki-NLP/opus-mt-ro-en`

Full traceback:

```
HTTPError                                 Traceback (most recent call last)
in <module>
      1 for index, row in df_review['AnswerValue'].iteritems():
----> 2     translated_row = model.translate(row, target_lang='en')#translating each row
      3     df_review.loc[index, 'Translate'] = translated_row

~/opt/anaconda3/lib/python3.8/site-packages/easynmt/EasyNMT.py in translate(self, documents, target_lang, source_lang, show_progress_bar, beam_size, batch_size, perform_sentence_splitting, paragraph_split, sentence_splitter, document_language_detection, **kwargs)
    152         except Exception as e:
    153             logger.warning("Exception: "+str(e))
--> 154             raise e
    155
    156         if is_single_doc and len(output) == 1:

~/opt/anaconda3/lib/python3.8/site-packages/easynmt/EasyNMT.py in translate(self, documents, target_lang, source_lang, show_progress_bar, beam_size, batch_size, perform_sentence_splitting, paragraph_split, sentence_splitter, document_language_detection, **kwargs)
    147             method_args['documents'] = [documents[idx] for idx in ids]
    148             method_args['source_lang'] = lng
--> 149             translated = self.translate(**method_args)
    150             for idx, translated_sentences in zip(ids, translated):
    151                 output[idx] = translated_sentences

~/opt/anaconda3/lib/python3.8/site-packages/easynmt/EasyNMT.py in translate(self, documents, target_lang, source_lang, show_progress_bar, beam_size, batch_size, perform_sentence_splitting, paragraph_split, sentence_splitter, document_language_detection, **kwargs)
    179         #logger.info("Translate {} sentences".format(len(splitted_sentences)))
    180
--> 181         translated_sentences = self.translate_sentences(splitted_sentences, target_lang=target_lang, source_lang=source_lang, show_progress_bar=show_progress_bar, beam_size=beam_size, batch_size=batch_size, **kwargs)
    182
    183         # Merge sentences back to documents

~/opt/anaconda3/lib/python3.8/site-packages/easynmt/EasyNMT.py in translate_sentences(self, sentences, target_lang, source_lang, show_progress_bar, beam_size, batch_size, **kwargs)
    276
    277         for start_idx in iterator:
--> 278             output.extend(self.translator.translate_sentences(sentences_sorted[start_idx:start_idx+batch_size], source_lang=source_lang, target_lang=target_lang, beam_size=beam_size, device=self.device, **kwargs))
    279
    280         #Restore original sorting of sentences

~/opt/anaconda3/lib/python3.8/site-packages/easynmt/models/OpusMT.py in translate_sentences(self, sentences, source_lang, target_lang, device, beam_size, **kwargs)
     38     def translate_sentences(self, sentences: List[str], source_lang: str, target_lang: str, device: str, beam_size: int = 5, **kwargs):
     39         model_name = 'Helsinki-NLP/opus-mt-{}-{}'.format(source_lang, target_lang)
---> 40         tokenizer, model = self.load_model(model_name)
     41         model.to(device)
     42

~/opt/anaconda3/lib/python3.8/site-packages/easynmt/models/OpusMT.py in load_model(self, model_name)
     20         else:
     21             logger.info("Load model: "+model_name)
---> 22             tokenizer = MarianTokenizer.from_pretrained(model_name)
     23             model = MarianMTModel.from_pretrained(model_name)
     24             model.eval()

~/opt/anaconda3/lib/python3.8/site-packages/transformers/tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
   1645         else:
   1646             # At this point pretrained_model_name_or_path is either a directory or a model identifier name
-> 1647             fast_tokenizer_file = get_fast_tokenizer_file(
   1648                 pretrained_model_name_or_path, revision=revision, use_auth_token=use_auth_token
   1649             )

~/opt/anaconda3/lib/python3.8/site-packages/transformers/tokenization_utils_base.py in get_fast_tokenizer_file(path_or_repo, revision, use_auth_token)
   3406     """
   3407     # Inspect all files from the repo/folder.
-> 3408     all_files = get_list_of_files(path_or_repo, revision=revision, use_auth_token=use_auth_token)
   3409     tokenizer_files_map = {}
   3410     for file_name in all_files:

~/opt/anaconda3/lib/python3.8/site-packages/transformers/file_utils.py in get_list_of_files(path_or_repo, revision, use_auth_token)
   1691     else:
   1692         token = None
-> 1693     model_info = HfApi(endpoint=HUGGINGFACE_CO_RESOLVE_ENDPOINT).model_info(
   1694         path_or_repo, revision=revision, token=token
   1695     )

~/opt/anaconda3/lib/python3.8/site-packages/huggingface_hub/hf_api.py in model_info(self, repo_id, revision, token)
    246         )
    247         r = requests.get(path, headers=headers)
--> 248         r.raise_for_status()
    249         d = r.json()
    250         return ModelInfo(**d)

~/opt/anaconda3/lib/python3.8/site-packages/requests/models.py in raise_for_status(self)
    941
    942         if http_error_msg:
--> 943             raise HTTPError(http_error_msg, response=self)
    944
    945     def close(self):

HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models/Helsinki-NLP/opus-mt-ro-en
```

Could you please review and fix the issue? Thank you.
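For reference, the failing call can be reproduced with a few lines outside the notebook. This is a minimal sketch: the DataFrame below is a hypothetical stand-in for the customer-review data, and only the column name `AnswerValue` comes from the traceback.

```python
from easynmt import EasyNMT
import pandas as pd

# Hypothetical stand-in for the customer-review data (column name taken from the traceback).
df_review = pd.DataFrame({'AnswerValue': ['Produsul este excelent, îl recomand.']})

model = EasyNMT('opus-mt')

# Same per-row loop as in the traceback: language detection returns 'ro',
# OpusMT then builds the model name 'Helsinki-NLP/opus-mt-ro-en', and the
# Hugging Face Hub API answers with 404 because that checkpoint is not available.
for index, row in df_review['AnswerValue'].iteritems():
    df_review.loc[index, 'Translate'] = model.translate(row, target_lang='en')
```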
nreimers commented 3 years ago

Hi, unfortunately there is no RO-EN model available from opus-mt.

Try to use the m2m models.
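For example, switching to the multilingual m2m models could look roughly like this. A minimal sketch: `m2m_100_418M` and `m2m_100_1.2B` are the m2m model names listed in the EasyNMT README, and the Romanian sentence is made up for illustration.

```python
from easynmt import EasyNMT

# m2m_100 is a single many-to-many model, so translation does not depend on a
# per-language-pair opus-mt checkpoint (such as opus-mt-ro-en) existing on the Hub.
model = EasyNMT('m2m_100_418M')  # or 'm2m_100_1.2B' for higher quality at higher memory cost

# Hypothetical Romanian review text.
print(model.translate('Produsul a ajuns repede și funcționează perfect.',
                      source_lang='ro', target_lang='en'))
```

As with opus-mt, `source_lang` should also be possible to omit, in which case EasyNMT detects the language per document.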