MohammadrezaPourreza / Few-shot-NL2SQL-with-prompting

MIT License
305 stars · 61 forks

How to run DIN-SQL with GPT-3.5/Davinci #1

Closed ghost closed 1 year ago

ghost commented 1 year ago

Hi Mohammadreza,

I wanted to run the script with GPT-3.5 as I don't have access to GPT-4 (on the waitlist). After changing the model variable in GPT4_generation to gpt-3.5-turbo and running the script, I'm getting the following error:

```
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4504 tokens. Please reduce the length of the messages.
```

My understanding is that we need to reduce the prompt size, since GPT-3.5 doesn't support as many tokens as GPT-4 (8,192). Interestingly, Davinci has roughly the same context size as GPT-3.5 (~4,096 tokens), so the question boils down to figuring out how to run the script with Davinci, which was one of the LLMs used to evaluate DIN-SQL in the paper.
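One rough workaround (a sketch, not part of this repo) is to trim the oldest few-shot demonstrations from the prompt until it fits the smaller context window. The names below (`trim_messages`, `MAX_CONTEXT`, `RESPONSE_BUDGET`) are illustrative, and tokens are approximated as ~4 characters each; for exact counts you'd use OpenAI's tiktoken library.

```python
# Sketch: drop the oldest non-system messages (few-shot demos) from a chat
# prompt until the estimated size, plus a budget reserved for the model's
# completion, fits the target context window. Token counts are a crude
# len(text) // 4 approximation, not exact BPE counts.

MAX_CONTEXT = 4097        # gpt-3.5-turbo context window
RESPONSE_BUDGET = 600     # tokens reserved for the completion (assumed value)

def estimate_tokens(text: str) -> int:
    """Crude approximation: roughly 4 characters per token for English."""
    return max(1, len(text) // 4)

def trim_messages(messages, max_context=MAX_CONTEXT, reserve=RESPONSE_BUDGET):
    """Remove messages after the system prompt, oldest first, until the
    estimated prompt plus the reserved completion budget fits."""
    messages = list(messages)

    def total(msgs):
        return sum(estimate_tokens(m["content"]) for m in msgs)

    # Always keep the system message (index 0) and the final user question.
    while len(messages) > 2 and total(messages) + reserve > max_context:
        messages.pop(1)
    return messages
```

Dropping demonstrations will likely hurt accuracy relative to the full DIN-SQL prompt, but it at least lets the script run within the 4,097-token limit.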

Thanks for your help.

MohammadrezaPourreza commented 1 year ago

The reported results for the Davinci model are for the Codex Davinci model, which is accessible through Microsoft Azure. The Codex Davinci model has a larger context window of 8,001 tokens. You can see the context window sizes of the models at this link: https://platform.openai.com/docs/models/codex