togethercomputer / together-python

The Official Python Client for Together's API
https://pypi.org/project/together/
Apache License 2.0

Llama-3 models hallucinate when `stop` is not found in output. #137

Closed: Infrared1029 closed this issue 4 months ago

Infrared1029 commented 4 months ago
from together import Together

client = Together(api_key="my_key")

# `stop=["test"]` never appears in the completion, so it should have no effect
response = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "write `bar` then `foo` three times"}],
    max_tokens=200,
    stop=["test"],
)
print(response.choices[0].message.content)

outputs:

Here is the output:

bar
foo
foo
fooassistant

Here is the output:

bar
foo
foo
fooassistant

Here is the output:
...

This seems to happen with Llama 3 models only? Llama 2 does fine.

orangetin commented 4 months ago

Can you also try passing `<|eot_id|>` in the list of stop tokens? I believe what's happening is that the tokenizer is ignoring it.
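For example, the same call with Llama 3's end-of-turn token appended to the stop list (the stray `assistant` in your output is the next turn's chat-template header leaking through because that token isn't being honored):

response = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "write `bar` then `foo` three times"}],
    max_tokens=200,
    stop=["test", "<|eot_id|>"],  # <|eot_id|> marks end-of-turn for Llama 3 chat models
)
print(response.choices[0].message.content)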

Infrared1029 commented 4 months ago

> Can you also try passing `<|eot_id|>` in the list of stop tokens? I believe what's happening is that the tokenizer is ignoring it.

Yeah, it works when I add it:

Sure! Here's `bar`, followed by `foo` three times:

bar
foo
foo
foo
orangetin commented 4 months ago

Sweet! Will patch this server-side too.
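Roughly, the fix is to merge the model's end-of-turn token into whatever stop list the user passes, so the turn boundary always terminates generation. A minimal sketch of that merge (`DEFAULT_STOPS` and `merge_stops` are illustrative names, not Together's actual server internals):

# Hypothetical server-side merge: always append the model's default
# end-of-turn tokens to the user-supplied stop list.
DEFAULT_STOPS = {
    "meta-llama/Llama-3-70b-chat-hf": ["<|eot_id|>"],
}

def merge_stops(model: str, user_stops: list[str] | None) -> list[str]:
    stops = list(user_stops or [])
    for tok in DEFAULT_STOPS.get(model, []):
        if tok not in stops:
            stops.append(tok)
    return stops

# merge_stops("meta-llama/Llama-3-70b-chat-hf", ["test"])
# -> ["test", "<|eot_id|>"]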