hegelai / prompttools

Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
http://prompttools.readthedocs.io
Apache License 2.0

chat model and not supported in the v1/completions endpoint #115

Closed: shrijayan closed this issue 9 months ago

shrijayan commented 9 months ago

🐛 Describe the bug

Code

from prompttools.experiment import OpenAICompletionExperiment
from prompttools.harness import PromptTemplateExperimentationHarness

harness = PromptTemplateExperimentationHarness(
    OpenAICompletionExperiment,
    "gpt-3.5-turbo-0301",  # any model I pass other than text-davinci-003 raises this error
    prompt_templates,
    user_inputs,
    # Zero temperature is better for structured outputs
    model_arguments={"temperature": 0},
)

Error

WARNING:prompttools.requests.retries:Retrying prompttools.requests.request_queue.RequestQueue._run in 4.0 seconds as it raised NotFoundError: Error code: 404 - {'error': {'message': 'This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?', 'type': 'invalid_request_error', 'param': 'model', 'code': None}}.

NivekT commented 9 months ago

Hi @shrijayan,

The first argument, OpenAICompletionExperiment, is intended to work with non-chat (completion) models such as text-davinci-003. If you want to use gpt-3.5-turbo-0301, you should pass OpenAIChatExperiment as the first argument instead.

There is also a ChatPromptTemplateExperimentationHarness you can use; either way, your experiment (first argument) and model name (second argument) need to match.
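Concretely, the two valid pairings look roughly like this (a sketch assuming the standard prompttools imports; prompt_templates, message_templates, and user_inputs stand in for your own data):

from prompttools.experiment import OpenAIChatExperiment, OpenAICompletionExperiment
from prompttools.harness import (
    ChatPromptTemplateExperimentationHarness,
    PromptTemplateExperimentationHarness,
)

# Completion experiment paired with a completion model:
completion_harness = PromptTemplateExperimentationHarness(
    OpenAICompletionExperiment, "text-davinci-003", prompt_templates, user_inputs
)

# Chat experiment paired with a chat model (templates must be in chat format):
chat_harness = ChatPromptTemplateExperimentationHarness(
    OpenAIChatExperiment, "gpt-3.5-turbo", message_templates, user_inputs
)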

NivekT commented 9 months ago

Feel free to re-open this issue if you have other questions!

shrijayan commented 9 months ago

Try 1

harness = PromptTemplateExperimentationHarness(
    OpenAIChatExperiment,
    "gpt-3.5-turbo",
    prompt_templates,
    user_inputs,
    # Zero temperature is better for structured outputs
    model_arguments={"temperature": 0},
)

harness.run()
harness.visualize()

Error

WARNING:prompttools.requests.retries:Retrying prompttools.requests.request_queue.RequestQueue._run in 3.0 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "'Generate valid JSON from the following input: The task is to count all the words in a string' is not of type 'array' - 'messages'", 'type': 'invalid_request_error', 'param': None, 'code': None}}.
Exception in thread Thread-14 (_process_queue):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.10/dist-packages/sentry_sdk/integrations/threading.py", line 72, in run
    reraise(*_capture_exception())
  File "/usr/local/lib/python3.10/dist-packages/sentry_sdk/_compat.py", line 115, in reraise
    raise value
  File "/usr/local/lib/python3.10/dist-packages/sentry_sdk/integrations/threading.py", line 70, in run
    return old_run_func(self, *a, **kw)
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/prompttools/requests/request_queue.py", line 37, in _process_queue
    self._do_task(fn, args)
  File "/usr/local/lib/python3.10/dist-packages/prompttools/requests/request_queue.py", line 48, in _do_task
    res = self._run(fn, args)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 325, in iter
    raise retry_exc.reraise()
  File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 158, in reraise
    raise self.last_attempt.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/prompttools/requests/request_queue.py", line 59, in _run
    result = fn(**args)
  File "/usr/local/lib/python3.10/dist-packages/openai/_utils/_utils.py", line 303, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/openai/resources/chat/completions.py", line 604, in create
    return self._post(
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1088, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 853, in request
    return self._request(
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 930, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "'Generate valid JSON from the following input: The task is to count all the words in a string' is not of type 'array' - 'messages'", 'type': 'invalid_request_error', 'param': None, 'code': None}}
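
For reference, the 400 above is the chat endpoint rejecting a bare string: as the error message says, messages must be an array of role/content objects. A minimal sketch of the expected request shape using the v1 openai client (not code from the thread; the prompt string is the one quoted in the error):

from openai import OpenAI

client = OpenAI()

# The chat endpoint requires `messages` to be a list of dicts, not a plain string:
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "Generate valid JSON from the following input: "
            "The task is to count all the words in a string",
        }
    ],
    temperature=0,
)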

Try 2

harness = ChatPromptTemplateExperimentationHarness(
    OpenAIChatExperiment,
    "gpt-3.5-turbo",
    prompt_templates,
    user_inputs,
    # Zero temperature is better for structured outputs
    model_arguments={"temperature": 0},
)

Error

TypeError                                 Traceback (most recent call last)
<ipython-input-20-ff6c3958f92b> in <cell line: 1>()
----> 1 harness.run()
      2 harness.visualize()

/usr/local/lib/python3.10/dist-packages/prompttools/harness/chat_prompt_template_harness.py in _render_messages_openai_chat(message_template, user_input, environment)
     20 def _render_messages_openai_chat(message_template: list[dict], user_input: dict, environment):
     21     rendered_message = deepcopy(message_template)
---> 22     sys_msg_template = environment.from_string(rendered_message[0]["content"])
     23     user_msg_template = environment.from_string(rendered_message[-1]["content"])
     24     rendered_message[0]["content"] = sys_msg_template.render(**user_input)

TypeError: string indices must be integers
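
This TypeError arises because _render_messages_openai_chat assumes each template is a list of message dicts; indexing a plain string template with [0] returns a one-character string, which cannot then be subscripted with "content". A minimal illustration (hypothetical template text, not from the thread):

from copy import deepcopy

template = "Who was the {{input}} president?"  # a str, not a list[dict]
rendered = deepcopy(template)                  # deepcopy of a str is still a str
first = rendered[0]                            # "W": indexing a str yields a character
first["content"]                               # TypeError: string indices must be integers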

NivekT commented 9 months ago

When you use ChatPromptTemplateExperimentationHarness, you need to use prompt templates that are in the chat format, i.e. lists of role/content message dicts.

NivekT commented 9 months ago

Here is an example:

message_templates = [
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who was the {{input}} president?"},
    ],
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who was the {{input}} vice president?"},
    ]
]

user_inputs = [{"input": "first"}, {"input": "second"}]

harness = ChatPromptTemplateExperimentationHarness(OpenAIChatExperiment,
                                                   "gpt-3.5-turbo",
                                                   message_templates,
                                                   user_inputs,
                                                   model_arguments=None)
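
As in the earlier attempts, the harness is then run and inspected with:

harness.run()
harness.visualize()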