Receiving an unclear KeyError: 'error' when using Langchain HuggingFaceTextGenInference on Runpod GPU

axiom9 commented 1 year ago

System Info

Langchain version: 0.0.278
Python: 3.10
Runpod version: 1.2.0

I'm running into an issue when using Runpod (TGI GPU cloud) together with LangChain, specifically when I try to run the chain. For reference, I'm using the TheBloke/VicUnlocked-30B-LoRA-GPTQ model in TGI on a Runpod A4500 GPU cloud pod. I initialize the pod from the UI and connect to it with the runpod-python library (version 1.2.0) in my Python 3.10 environment. My prompt template is as follows:

prompt = PromptTemplate(
    input_variables=['instruction','summary'],
    template="""### Instruction:
    {instruction}

    ### Input:
    {summary}

    ### Response:

    """)

The instruction is a simple instruction to extract relevant insights from a summary. My LLM is instantiated as such:

inference_server_url = f'https://{pod["id"]}-{port}.proxy.runpod.net'  # note: the pod and port are defined previously
llm = HuggingFaceTextGenInference(inference_server_url=inference_server_url)

And I am trying to run the model as such:

summary = ... # summary here
instruction = ... #instruction here
chain.run({"instruction": instruction, "summary": summary})  # Note: the error is raised on this line

But I get this error:

File ~/anaconda3/envs/py310/lib/python3.10/site-packages/langchain/chains/base.py:282, in Chain.__call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    280 except (KeyboardInterrupt, Exception) as e:
    281     run_manager.on_chain_error(e)
--> 282     raise e
    283 run_manager.on_chain_end(outputs)
    284 final_outputs: Dict[str, Any] = self.prep_outputs(
    285     inputs, outputs, return_only_outputs
...
---> 81 message = payload["error"]
     82 if "error_type" in payload:
     83     error_type = payload["error_type"]

KeyError: 'error'
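
The last frame of the traceback indexes `payload["error"]` unconditionally, so a `KeyError` there suggests the server (or the proxy in front of it) answered with a JSON body that has no `error` field, and the real message is being hidden. The sketch below is a hypothetical helper, not part of LangChain or the client library in the traceback; it only illustrates how such a payload could be read defensively so that the underlying response stays visible. The sample payloads are invented for illustration.

```python
from typing import Any, Dict, Optional, Tuple


def describe_error_payload(payload: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Return (message, error_type) without assuming an 'error' key exists."""
    # Fall back to the whole payload so nothing the server said is lost.
    message = payload.get("error", f"unexpected error payload: {payload!r}")
    error_type = payload.get("error_type")
    return str(message), error_type


# Payload with the fields the code in the traceback expects.
print(describe_error_payload({"error": "Input validation error", "error_type": "validation"}))

# Payload without an 'error' key, which raises KeyError when indexed directly.
print(describe_error_payload({"detail": "Not Found"}))
```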

Any ideas?

Who can help?

No response

Information

[ ] The official example notebooks/scripts
[X] My own modified scripts

Related Components

[X] LLMs/Chat Models
[ ] Embedding Models
[X] Prompts / Prompt Templates / Prompt Selectors
[ ] Output Parsers
[ ] Document Loaders
[ ] Vector Stores / Retrievers
[ ] Memory
[ ] Agents / Agent Executors
[ ] Tools / Toolkits
[X] Chains
[X] Callbacks/Tracing
[ ] Async

Reproduction

My prompt template is as follows:

prompt = PromptTemplate(
    input_variables=['instruction','summary'],
    template="""### Instruction:
{instruction}

### Input:
{summary}

### Response:

""")

The instruction is a simple instruction to extract relevant insights from a summary. My LLM is instantiated as such:

inference_server_url = f'https://{pod["id"]}-{port}.proxy.runpod.net'  # note: the pod and port are defined previously
llm = HuggingFaceTextGenInference(inference_server_url=inference_server_url)

And I am trying to run the model as such:

summary = ... # summary here
instruction = ... # instruction here
chain.run({"instruction": instruction, "summary": summary})

When I call chain.run, I get the error shown above.
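
To see what the endpoint actually returns, it can help to bypass LangChain and send one request by hand. The sketch below assumes the pod exposes the standard Text Generation Inference `/generate` route, which accepts a JSON body with `inputs` and `parameters`. The helper names, the example URL, and the `max_new_tokens` value are placeholders, not taken from the issue.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def build_generate_request(base_url: str, prompt: str, max_new_tokens: int = 64) -> urllib.request.Request:
    """Build a POST request for a TGI-style /generate route (assumed layout)."""
    body = json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_raw_response(request: urllib.request.Request, timeout: float = 60.0) -> Tuple[int, str]:
    """Send the request and return (status, raw body text), even on HTTP errors."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # Non-2xx answers still carry a body; that body is what we want to inspect.
        return exc.code, exc.read().decode("utf-8", errors="replace")


# Placeholder URL; substitute the real pod id and port.
request = build_generate_request("https://example-pod-8080.proxy.runpod.net", "### Instruction:\nSay hello.")
print(request.full_url)
print(request.data.decode("utf-8"))
# status, text = fetch_raw_response(request)  # uncomment to hit a live endpoint
```

If the body that comes back is HTML or a JSON object without an `error` key, that would be consistent with the KeyError in the traceback.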

Expected behavior

What's expected is that I should be receiving the output from the runpod GPU cloud that is hosting the model, as per this guide that I am following: https://colab.research.google.com/drive/10BJcKRBtMlpm2hsS2antarSRgEQY3AQq#scrollTo=lyVYLW2thTMg

dosubot[bot] commented 1 year ago

🤖

Hello,

Thank you for providing detailed information about the issue you're facing. The KeyError: 'error' you're encountering is likely due to an unexpected response from the HuggingFaceHub's Inference API. The error suggests that the response from the API call in the embed_documents or embed_query methods is not as expected.

However, without the exact line where the error occurs, or the error message details, it's difficult to pinpoint the exact cause of the error. It could be due to an invalid repo_id, an unsupported task, missing huggingfacehub_api_token, or an issue with the input texts.

Please ensure that:

You have a valid huggingfacehub_api_token set either as an environment variable or passed as a parameter to the HuggingFaceHubEmbeddings constructor.
The repo_id you're using starts with "sentence-transformers", as currently only 'sentence-transformers' embedding models are supported.
The task you're using is "feature-extraction", as it's the only supported task.
The texts you're trying to embed do not contain any characters or formats that might cause issues with the API.

If the error persists, please provide more details about the error message and the line where it occurs. This will help us to better understand the issue and provide a more accurate solution.

I hope this helps. Please let me know if you have any other questions or need further clarification.

Best, Dosu

dosubot[bot] commented 10 months ago

Hi, @axiom9! I'm Dosu, and I'm here to help the LangChain team manage our backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding of the issue, you are experiencing a KeyError: 'error' when running the Langchain HuggingFaceTextInference on Runpod GPU. It seems that the error occurs when trying to run the model using the chain.run() method.

I see that I have provided some troubleshooting steps and suggested that the error is likely due to an unexpected response from the HuggingFaceHub's Inference API. I have also shared some relevant source files for reference.

Before we proceed, I wanted to check with you if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and cooperation!

langchain-ai / langchain