microsoft / LLMLingua

To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
https://llmlingua.com/
MIT License
4.42k stars · 241 forks

Which version of openai should be installed to reproduce gsm8k with llmlingua? #17

Open LYH-YF opened 9 months ago

LYH-YF commented 9 months ago

I installed version 0.27.4 to run examples/CoT.ipynb, and an error was raised when running the following lines:

request_data = {
    "prompt": prompt,
    "max_tokens": 400,
    "temperature": 0,
    "top_p": 1,
    "n": 1,
    "stream": False,
    "stop": "\n\n",
}
response = openai.Completion.create(
    model="gpt-3.5-turbo-0301",
    **request_data,
)
print(json.dumps(response, indent=4))
openai.error.InvalidRequestError: This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?

I updated openai to version 0.28.1, but the error persists; updating to a newer version doesn't help either. So I changed the code according to the error message. It seems gpt-3.5-turbo-0301 can only be used with ChatCompletion:

request_data = {
    "messages": [{"role": "system", "content": ""}, {"role": "user", "content": prompt}],
    "max_tokens": 400,
    "temperature": 0,
    "top_p": 1,
    "n": 1,
    "stream": False,
    "stop": "\n\n",
}
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0301",
    **request_data,
)

The final result is 0.439, far from 0.78+.

num_q 1319 correct 579 ratio 0.4390

Is there any suggestion for me?

  1. Should I use an older or a newer version of openai?
  2. An older openai may not support newer models, so I prefer to use a newer openai (for wider testing) to reproduce the result. But the result with gpt-3.5-turbo-0301 does not seem good.
iofu728 commented 9 months ago

Hi @LYH-YF, the GSM8K experiment is based on the GPT-3.5-Turbo-0301 completion model.

Due to recent changes in OpenAI's API, the 3.5-turbo-0301 completion mode is no longer available, but it can be obtained through Azure OpenAI.

In addition, the reasons for the poor performance of the chat mode are:

  1. the generation is stopped too early; please remove the "stop": "\n\n" parameter;
  2. the 0301 chat mode's ability to follow instructions is weak; this area needs more experiments, and I will follow up on it after the rebuttal.
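Concretely, fix 1 amounts to dropping the "stop" entry from the chat request shown above. A sketch (the placeholder prompt is illustrative; everything else is unchanged from the thread):

```python
prompt = "Question: What is 2 + 2?\nAnswer:"  # placeholder for the real CoT prompt

request_data = {
    "messages": [{"role": "system", "content": ""}, {"role": "user", "content": prompt}],
    "max_tokens": 400,
    "temperature": 0,
    "top_p": 1,
    "n": 1,
    "stream": False,
    # "stop": "\n\n" removed: chain-of-thought answers contain blank lines,
    # so this stop sequence truncates the generation too early.
}
```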
LYH-YF commented 9 months ago

Thanks for your reply. I removed the stop parameter, and the result reached 0.68+. So there may be a gap of about 0.10 between chat mode and completion mode (GPT-3.5-Turbo-0301).

lxz12 commented 8 months ago

Hello, can I have a look at your CoT.ipynb file? My openai is version 1.0.0 and I am using:

import openai
openai.api_key = "sk-XXX"

as well as

import json
instruction = "Please reference the following examples to answer the math question,\n"
prompt = instruction + prompt_complex + "\n\nQuestion: " + question

request_data = {
     "messages": [{"role": "system", "content": ""}, {"role": "user", "content": prompt}],
     "max_tokens": 400,
     "temperature": 0,
     "top_p": 1,
     "n": 1,
     "stream": False,
     "stop": "\n\n",
}
response = openai.ChatCompletion.create(
     model="gpt-3.5-turbo-0301",
     **request_data,
)

But an error was reported

APIRemovedInV1:

You tried to access openai.ChatCompletion, but this is no longer supported in openai>=1.0.0 - see the README at https://github.com/openai/openai-python for the API.

You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface.

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742
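In openai>=1.0.0, the module-level openai.ChatCompletion class is gone and requests go through a client object instead. A minimal sketch of the migrated call, keeping the same parameters as above (the build_request helper and placeholder prompt are illustrative, not from the notebook; running the commented-out call requires an OPENAI_API_KEY):

```python
def build_request(prompt: str) -> dict:
    """Build a chat-completions payload equivalent to the legacy
    ChatCompletion request above (no "stop" parameter, per the fix
    earlier in this thread)."""
    return {
        "model": "gpt-3.5-turbo-0301",
        "messages": [
            {"role": "system", "content": ""},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 400,
        "temperature": 0,
        "top_p": 1,
        "n": 1,
    }

# With openai>=1.0.0 the call site becomes:
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   response = client.chat.completions.create(**build_request(prompt))
#   print(response.choices[0].message.content)
```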

I don't know how to solve it, so can you please send me the code you changed? Thank you very much!