jackMort / ChatGPT.nvim

ChatGPT Neovim Plugin: Effortless Natural Language Generation with OpenAI's ChatGPT API

The default `max_tokens = 300` leads to the reply being cut off #302

Closed · kohane27 closed this issue 9 months ago

kohane27 commented 9 months ago

Hello there. Hope you're doing well. Thank you for creating chatgpt.nvim. The integration of ChatGPT into Neovim is a match made in heaven! However, I encountered the following problem.

Description

With the default config below, ChatGPT stops generating before it finishes (i.e., the reply gets cut off):

```lua
openai_params = {
  model = "gpt-3.5-turbo",
  frequency_penalty = 0,
  presence_penalty = 0,
  max_tokens = 300,
  temperature = 0,
  top_p = 1,
  n = 1,
},
```

max_tokens = 300

Give the following prompt to ChatGPT and it will stop generating mid-way:

Generate a game of tic-tac-toe in Python. Add very detailed comment on every line of code.

Response:

Sure! Here's a game of tic-tac-toe in Python with detailed comments explaining each line of code:

```python
# Create a 3x3 board for tic-tac-toe
board = [[' ' for _ in range(3)] for _ in range(3)]

# Function to print the current state of the board
def print_board():
    print('---------')
    for row in board:
        print('|', end='')
        for cell in row:
            print(cell, end='|')
        print('\n---------')

# Function to check if a player has won
def check_win(player):
    # Check rows
    for row in board:
        if all(cell == player for cell in row):
            return True
    # Check columns
    for col in range(3):
        if all(board[row][col] == player for row in range(3)):
            return True
    # Check diagonals
    if board[0][0] == board[1][1] == board[2][2] == player:
        return True
    if board[0][2] == board[1][1] == board[2][0] == player:
        return True
    return False

# Function to check if the board is full
def check_full():
    for row in board:
        if ' ' in row:
            return False
    return True

# Function to play the game
def play_game():
```

You can see the response is incomplete.

max_tokens = 4097

Response:

This model's maximum context length is 4097 tokens. However, you requested 4127 tokens (30 in the messages, 4097 in the completion). Please reduce the length of the messages or completion.
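
In other words, the context window is shared between the messages and the completion, so the two must sum to at most 4097. A quick check of the arithmetic from the error above:

```lua
-- gpt-3.5-turbo's 4097-token context window is shared:
-- message_tokens + max_tokens must be <= 4097.
local context_length = 4097
local message_tokens = 30  -- as reported in the error above
print(context_length - message_tokens)  --> 4067, the largest max_tokens this prompt allows
```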

max_tokens = 100

Response:

Sure! Here's a game of tic-tac-toe in Python with detailed comments on each line of code:

```python
# Create a 3x3 board for tic-tac-toe
board = [[' ' for _ in range(3)] for _ in range(3)]

# Function to print the current state of the board
def print_board():
    print('---------')
    for row in board:
        print('|', end='')
        for cell in row:
            print(cell
```

max_tokens = 3000

If I set max_tokens = 3000, I get a complete response.

Question

Given that the gpt-3.5-turbo model's maximum context length is 4097 tokens, what is the maximum or optimal max_tokens value I can set to reduce the chance of the reply getting cut off?

I can't simply set max_tokens = 4097 because I don't know how many tokens the messages will need (hence the error above: 30 tokens in the messages plus 4097 in the completion exceeds the 4097-token limit). I'm currently setting max_tokens = 3000, but sometimes when the reply is long it still gets cut off.
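
One rough workaround I've considered (a sketch only, not a ChatGPT.nvim feature; `estimate_tokens` below is a hypothetical helper based on the ~4-characters-per-token rule of thumb for English text) is to budget max_tokens from the prompt length:

```lua
-- Hypothetical helper: rough token estimate (~4 chars per token for English text)
local function estimate_tokens(text)
  return math.ceil(#text / 4)
end

local CONTEXT_LENGTH = 4097  -- gpt-3.5-turbo context window
local MARGIN = 200           -- slack for chat formatting and estimation error

-- Leave the completion whatever the messages don't use
local function budget_max_tokens(prompt)
  return CONTEXT_LENGTH - estimate_tokens(prompt) - MARGIN
end

print(budget_max_tokens("Generate a game of tic-tac-toe in Python."))
```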

Any input is much appreciated. Thank you!

ekaj2 commented 9 months ago

There is no specific number; this is an issue any time you use the OpenAI API. In general, 3k tokens is enough, but if you need more, you can switch to gpt-3.5-turbo-16k for a much larger context window. There's also gpt-4-32k if your API key has access to it.

Reference: https://platform.openai.com/docs/models/gpt-3-5
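
For example, switching to the 16k model is just a change to the `model` field in the `openai_params` shown above (a sketch; the `max_tokens` value here is an arbitrary example, and messages plus completion must still fit in the window):

```lua
openai_params = {
  model = "gpt-3.5-turbo-16k",  -- 16,384-token context window
  frequency_penalty = 0,
  presence_penalty = 0,
  max_tokens = 8000,            -- example value; leave room for the messages
  temperature = 0,
  top_p = 1,
  n = 1,
},
```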