OpenAI Compatible API Call Not Possible

zanderjiang commented 3 months ago

Describe the issue as clearly as possible:

I'm trying to use Outlines with an OpenAI-compatible API, and I also need to pass some arguments to the LLM call. Right now the program raises no error, but it produces no output and hangs. I suspect the underlying API call is failing, but the API error is never surfaced, so the problem is very hard to diagnose.

Steps/code to reproduce the bug:

import os
from openai import AsyncOpenAI
from outlines import models, generate
from outlines.models.openai import OpenAIConfig
import tiktoken

client = AsyncOpenAI(
    api_key="API_KEY",
    base_url="OPENAI_COMPATIBLE_URL"
)
config = OpenAIConfig("compatible-model", temperature=0.1, top_p=0.9)
enc = tiktoken.get_encoding("cl100k_base")

model = models.openai(client, config, enc)

generator = generate.choice(model, ["skirt", "dress", "pen", "jacket"])
answer = generator("Pick the odd word out: skirt, dress, pen, jacket")
print(answer)

Expected result:

pen

Error message:

No error message; the function call hangs until it times out.

Outlines/Python version information:

Version information

(command output here)

Context for the issue:

No response

lapp0 commented 1 month ago

To generate choices with models.openai, Outlines tokenizes the choices and sets the logit_bias (token filtering) API argument such that there is ~0% chance of any other token being selected.
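As a rough illustration of the mechanism described above, the sketch below builds a logit_bias mapping from a list of choices. It is not the Outlines implementation: `build_logit_bias`, `toy_encode`, and `TOY_VOCAB` are hypothetical names, the toy tokenizer stands in for a real one, and the bias value of 100 is assumed because it is the maximum the OpenAI API accepts.

```python
from typing import Callable, Dict, Iterable, List


def build_logit_bias(
    choices: Iterable[str],
    encode: Callable[[str], List[int]],
    bias: int = 100,  # assumption: 100 is the largest bias OpenAI accepts
) -> Dict[int, int]:
    """Give every token id that appears in any choice a strong positive bias."""
    allowed: Dict[int, int] = {}
    for choice in choices:
        for token_id in encode(choice):
            allowed[token_id] = bias
    return allowed


# Hypothetical toy vocabulary, chosen so "jacket" splits into "j" + "acket"
# as reported in this thread. A real tokenizer has a much larger vocabulary.
TOY_VOCAB = {"skirt": 0, "dress": 1, "pen": 2, "j": 3, "acket": 4}


def toy_encode(text: str) -> List[int]:
    """Greedy longest-match tokenizer over TOY_VOCAB (illustration only)."""
    ids: List[int] = []
    rest = text
    pieces = sorted(TOY_VOCAB, key=len, reverse=True)
    while rest:
        piece = next(p for p in pieces if rest.startswith(p))
        ids.append(TOY_VOCAB[piece])
        rest = rest[len(piece):]
    return ids


logit_bias = build_logit_bias(["skirt", "dress", "pen", "jacket"], toy_encode)
```

With a real tokenizer, `encode` could be the `encode` method of the tiktoken encoding from the reproduction script. Note that the resulting mapping is a flat set of allowed tokens and carries no information about which token may follow which.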

There are two issues here:

Upstream top_p Handling: When top_p is set, OpenAI appears to completely ignore the logit_bias parameter.

Fix: We need an informative warning when top_p is set and the user is calling OpenAI.

Outlines Method: Outlines allows any token from any choice at any point. When I ran your reproduction script, because "jacket" tokenizes as ["j", "acket"], "j" was a legal token at every position, so in one case ChatGPT responded with "jj".

Fix: We could make requests token by token, but resolving https://github.com/outlines-dev/outlines/pull/1060 would be much cleaner.
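To make the second issue concrete, here is a minimal sketch contrasting a flat allowed-token set with a per-step constraint. It only illustrates the idea and is neither the Outlines code nor the approach taken in the linked PR. `allowed_next_tokens` and `TOKENIZED` are hypothetical, and the single-token splits for "skirt", "dress", and "pen" are assumed; only the ["j", "acket"] split comes from this thread.

```python
from typing import Sequence, Set


def allowed_next_tokens(
    choices: Sequence[Sequence[str]], prefix: Sequence[str]
) -> Set[str]:
    """Return the tokens that extend `prefix` toward at least one choice."""
    n = len(prefix)
    return {
        tokens[n]
        for tokens in choices
        if len(tokens) > n and list(tokens[:n]) == list(prefix)
    }


# Assumed tokenization of the four choices (only "jacket" is from the thread).
TOKENIZED = [["skirt"], ["dress"], ["pen"], ["j", "acket"]]

# Flat constraint: any token from any choice is legal at every position,
# so the sequence ["j", "j"] passes the filter.
flat_allowed = {tok for tokens in TOKENIZED for tok in tokens}
flat_accepts_jj = all(tok in flat_allowed for tok in ["j", "j"])

# Per-step constraint: the allowed set depends on what was generated so far.
first_step = allowed_next_tokens(TOKENIZED, [])
second_step = allowed_next_tokens(TOKENIZED, ["j"])
```

Under the per-step constraint, only "acket" is permitted after "j", so "jj" can no longer be produced. The cost is one API request per token, each with its own logit_bias.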
