
Incomplete responses for basic example when using HF with Llama and Falcon LLMs #9194

Closed: sai-guidizy closed this issue 10 months ago

sai-guidizy commented 1 year ago

Issue you'd like to raise.

I am testing some very basic code as follows using LangChain:


from langchain import HuggingFaceHub
from langchain import PromptTemplate, LLMChain
import asyncio

question = "Who won the FIFA World Cup in the year 1994? "

template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate(template=template, input_variables=["question"])

# Hosted model on the HuggingFace Hub; swap repo_id to try Falcon instead.
llm = HuggingFaceHub(
    # repo_id="tiiuae/falcon-7b",
    repo_id="meta-llama/Llama-2-7b-hf",
    huggingfacehub_api_token="MY-TOKEN",
    model_kwargs={
        "temperature": 0.5,
        "max_length": 64,
    },
)

llm_chain = LLMChain(prompt=prompt, llm=llm)

async def run_chain():
    # Run the chain asynchronously and print the model's completion.
    result = await llm_chain.arun(question)
    print("result from llm_chain is...", result)

asyncio.run(run_chain())

Output when tested with "meta-llama/Llama--2-7b-hf" is ...

We have to find the winner of the FIFA World Cup in the year 199

Output when tested with "tiiue/falcon-7b" is... restult with load_qua_chain is.... We know that 1994 is a leap year, and the previous year was 1993.

But the expected answer is not what I am getting, and it looks like either the response is not being fully retrieved from the LLM or I am doing something wrong. I followed the same code as shown in the LangChain documentation at https://python.langchain.com/docs/integrations/llms/huggingface_hub; the only difference is that I used HF with my API key instead of OpenAI.

Can I request any help or suggestions on whether I need to make any changes, please?

Expected behavior:

To get the full response as shown in the LangChain documentation for these examples.

hinthornw commented 1 year ago

Have you tried increasing the max_length? 64 tokens isn't many.
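
For reference, a minimal sketch of that change, assuming the same setup as the snippet above. max_length counts prompt plus completion tokens, so a budget of 64 would explain the Llama output stopping mid-number at "199". The max_new_tokens parameter, which the HF Inference API documents for text-generation tasks and which caps only the generated tokens, may be the more predictable knob here:

from langchain import HuggingFaceHub

# Same setup as in the report, with a larger generation budget.
llm = HuggingFaceHub(
    repo_id="meta-llama/Llama-2-7b-hf",
    huggingfacehub_api_token="MY-TOKEN",
    model_kwargs={
        "temperature": 0.5,
        # Caps only the generated tokens, so the prompt no longer
        # eats into the budget; 256 leaves room for a full answer.
        "max_new_tokens": 256,
    },
)

Everything else in the original script stays the same; only the generation budget changes.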

dosubot[bot] commented 11 months ago

Hi, @sai-guidizy! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you were experiencing an issue where the expected answer was not being fully retrieved from the language model (LLM) when using HuggingFace with the Llama and Falcon LLMs. One user, @hinthornw, suggested increasing the max_length parameter as a possible solution.

Before we close this issue, we wanted to check if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain repository!