google-deepmind / gemma

Open weights LLM from Google DeepMind.
http://ai.google.dev/gemma
Apache License 2.0

gemma-2b text-generation inconsistency #20

Closed hemanth closed 2 months ago

hemanth commented 6 months ago

A simple prompt asking "What is electroencephalography?" with the setup below:

from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline

# Load google/gemma-2b behind a Hugging Face text-generation pipeline
hf = HuggingFacePipeline.from_model_id(
    model_id="google/gemma-2b",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 100, "temperature": 0.8, "do_sample": True},
)

from langchain.prompts import PromptTemplate

template = """Question: {question}"""
prompt = PromptTemplate.from_template(template)

chain = prompt | hf

question = "What is electroencephalography?"

print(chain.invoke({"question": question}))

Results in:


 Then? What is electro encephalography? Which is electroencephalography? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is? Which is

I tried changing the task as well.

hemanth commented 6 months ago

Maybe it needs the <user> <model> prompt template?

shira-stromer commented 5 months ago

Try Gemma's format: `<start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model\n`
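As a plain-Python sketch of that format (the `<start_of_turn>`/`<end_of_turn>` control tokens are the documented Gemma chat markers; the exact layout here is my reading, not code from this thread):

```python
# Sketch: wrap a single question in Gemma's chat turn format.
# Layout assumed from Gemma's documented prompt format.
GEMMA_TURN = "<start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model\n"

def format_gemma_prompt(prompt: str) -> str:
    return GEMMA_TURN.format(prompt=prompt)

print(format_gemma_prompt("What is electroencephalography?"))
```

The rendered string can be passed to the pipeline as-is in place of the bare question.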

gustheman commented 2 months ago

The multiturn template will only work with the IT (instruction-tuned) models. I'd suggest using those if you want to ask questions; the PT (pretrained) models are not the best for that.
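A sketch of what "multiturn" means here (control-token layout assumed, not taken from this thread): each exchange is wrapped in its own turn, and the prompt ends with an open model turn for the model to continue:

```python
# Sketch: render a conversation history into Gemma's IT turn format.
# Roles are "user" or "model"; the trailing open model turn asks the
# model to continue. PT checkpoints were not trained on these markers.
def render_turns(turns):
    parts = [f"<start_of_turn>{role}\n{text}<end_of_turn>\n"
             for role, text in turns]
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

print(render_turns([("user", "What is electroencephalography?")]))
```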

hemanth commented 2 months ago

@gustheman Is there a langchain PromptTemplate for google/gemma-2b? If not, `<start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model\n` should work as @shira-stromer mentioned.

gustheman commented 2 months ago

Yes, the template suggested by @shira-stromer will work, but only with the IT model. The PT model doesn't know about it; even if it appears to work, the output may not be consistent.

hemanth commented 2 months ago

This worked, thank you @shira-stromer & @gustheman.

hemanth commented 2 months ago

But for google/gemma-2-27b-it I notice:

ValueError: The checkpoint you are trying to load has model type gemma2 but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

Is gemma2 not supported by Transformers?
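The error text itself points at an out-of-date Transformers install. A minimal sketch of the version check (the 4.42.0 floor for `gemma2` support is my assumption; consult the Transformers release notes):

```python
# Sketch: compare an installed transformers version string against
# the first release assumed to register the `gemma2` model type.
# 4.42.0 is an assumption; check the Transformers release notes.
def version_tuple(v: str):
    return tuple(int(p) for p in v.split("."))

def supports_gemma2(installed: str, minimum: str = "4.42.0") -> bool:
    return version_tuple(installed) >= version_tuple(minimum)

print(supports_gemma2("4.38.0"))  # older release: False
print(supports_gemma2("4.42.0"))  # assumed floor: True
```

If the check fails, `pip install -U transformers` and retry loading the checkpoint.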