benman1 / generative_ai_with_langchain

Build large language model (LLM) apps with Python, ChatGPT and other models. This is the companion repository for the book on generative AI with LangChain.
MIT License
552 stars, 219 forks

chapter 3 - Using Hugging Face #39

Closed (amscosta closed this issue 1 week ago)

amscosta commented 3 months ago

Hi,

The following code snippet from Chapter 3:

    from langchain.llms import HuggingFaceHub

    llm = HuggingFaceHub(
        model_kwargs={"temperature": 0.5, "max_length": 64},
        repo_id="google/flan-t5-xxl"
    )
    prompt = "In which country is Tokyo?"
    completion = llm(prompt)
    print(completion)

gives this error:

    ValueError                                Traceback (most recent call last)
    Cell In[6], line 2
          1 prompt = "In which country is Tokyo?"
    ----> 2 completion = llm(prompt)
          3 print(completion)

    File D:\generative_ai_with_langchain\pyenv\Lib\site-packages\langchain\llms\, in, prompt, stop, callbacks, tags, metadata, kwargs)
        818 if not isinstance(prompt, str):
        819     raise ValueError(
        820         "Argument prompt is expected to be a string. Instead found "
        821         f"{type(prompt)}. If you want to run the LLM on multiple prompts, use "
        822         "generate instead."
        823     )
        824 return (
    --> 825     self.generate(
        826         [prompt],
        827         stop=stop,
        828         callbacks=callbacks,
        829         tags=tags,
        830         metadata=metadata,
        831         **kwargs,
        832     )
        833     .generations[0][0]
        834     .text
        835 )

    File D:\generative_ai_with_langchain\pyenv\Lib\site-packages\langchain\llms\, in BaseLLM.generate(self, prompts, stop, callbacks, tags, metadata, **kwargs)
        612     raise ValueError(
        613         "Asked to cache, but no cache found at langchain.cache."
        614     )
        615 run_managers = [
        616     callback_manager.on_llm_start(
        617         dumpd(self), [prompt], invocation_params=params, options=options
        618     )[0]
        619     for callback_manager, prompt in zip(callback_managers, prompts)
        620 ]
    --> 621 output = self._generate_helper(
        622     prompts, stop, run_managers, bool(new_arg_supported), **kwargs
        623 )
        624 return output
        625 if len(missing_prompts) > 0:

    File D:\generative_ai_with_langchain\pyenv\Lib\site-packages\langchain\llms\, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
        521     for run_manager in run_managers:
        522         run_manager.on_llm_error(e)
    --> 523     raise e
        524 flattened_outputs = output.flatten()
        525 for manager, flattened_output in zip(run_managers, flattened_outputs):

    File D:\generative_ai_with_langchain\pyenv\Lib\site-packages\langchain\llms\, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
        500 def _generate_helper(
        501     self,
        502     prompts: List[str],
       (...)
        506     **kwargs: Any,
        507 ) -> LLMResult:
        508     try:
        509         output = (
    --> 510             self._generate(
        511                 prompts,
        512                 stop=stop,
        513                 # TODO: support multiple run managers
        514                 run_manager=run_managers[0] if run_managers else None,
        515                 **kwargs,
        516             )
        517             if new_arg_supported
        518             else self._generate(prompts, stop=stop)
        519         )
        520     except (KeyboardInterrupt, Exception) as e:
        521         for run_manager in run_managers:

    File D:\generative_ai_with_langchain\pyenv\Lib\site-packages\langchain\llms\, in LLM._generate(self, prompts, stop, run_manager, **kwargs)
        997 new_arg_supported = inspect.signature(self._call).parameters.get("run_manager")
        998 for prompt in prompts:
        999     text = (
    -> 1000         self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
       1001         if new_arg_supported
       1002         else self._call(prompt, stop=stop, **kwargs)
       1003     )
       1004     generations.append([Generation(text=text)])
       1005 return LLMResult(generations=generations)

    File D:\generative_ai_with_langchain\pyenv\Lib\site-packages\langchain\llms\, in HuggingFaceHub._call(self, prompt, stop, run_manager, **kwargs)
        110 response = self.client(inputs=prompt, params=params)
        111 if "error" in response:
    --> 112     raise ValueError(f"Error raised by inference API: {response['error']}")
        113 if self.client.task == "text-generation":
        114     # Text generation return includes the starter text.
        115     text = response[0]["generated_text"][len(prompt) :]

    ValueError: Error raised by inference API: Service Unavailable

benman1 commented 1 week ago

Hi @amscosta. Sorry again for the delayed response. I've found that the Hugging Face Hub inference service is sometimes unreliable, and I think I left a comment in the book to this effect. In retrospect, I should have skipped this example in the book, but I was hoping the service would improve. You can self-host the model or use a GCP service, which is free and should be a good experience.
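
Since the "Service Unavailable" error is intermittent, one possible workaround is to retry the call a few times before giving up. The following is only a minimal sketch, not something from the book or from LangChain: `call_with_retry` and its parameters (`max_attempts`, `base_delay`, `sleep`) are hypothetical names chosen for illustration, and it assumes the error keeps surfacing as a `ValueError` whose message contains "Service Unavailable", as in the traceback in this issue.

```python
import time


def call_with_retry(llm, prompt, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call llm(prompt), retrying when the API reports it is unavailable.

    `llm` is any callable that takes a prompt string and returns text.
    This helper is illustrative only; it is not part of LangChain.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return llm(prompt)
        except ValueError as err:
            # Assumption: the hosted API's failure arrives as a ValueError
            # whose message includes "Service Unavailable".
            transient = "Service Unavailable" in str(err)
            if not transient or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With the `llm` object from the snippet in the question, this might be used as `completion = call_with_retry(llm, "In which country is Tokyo?")`. Retrying only helps with short outages; if the hosted model stays unavailable, self-hosting or another provider is still the more dependable route.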