langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License

GPT 3.5 turbo instruct model: Failed to calculate number of tokens, falling back to approximate count #2668

Closed Prince-Mendiratta closed 1 year ago

Prince-Mendiratta commented 1 year ago

Hello team!

Thank you for the project, this is super helpful.

I was trying out the latest gpt-3.5-turbo-instruct model and ran into the error below. This is the code I used:

import { OpenAI } from "langchain/llms/openai";

export async function instructTest() {
  let gpt3Instruct = new OpenAI({
    temperature: 0.69,
    openAIApiKey: process.env.API_KEY,
    modelName: "gpt-3.5-turbo-instruct",
    topP: 1,
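    // -1 asks LangChain to compute the completion budget from the model's remaining context window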
    maxTokens: -1,
  });

  return await gpt3Instruct.predict("Tell me why is 1+1 = 3?");
}

This throws the following error and the request does not go through:

Failed to calculate number of tokens, falling back to approximate count
/Users/prince/Documents/startup/DB-Sync/node_modules/langchain/node_modules/openai/src/error.ts:75
    }
     ^
BadRequestError: This model's maximum context length is 4097 tokens, however you requested 4102 tokens (12 in your prompt; 4090 for the completion). Please reduce your prompt; or completion length.
    at Function.generate (/Users/prince/Documents/startup/DB-Sync/node_modules/langchain/node_modules/openai/src/error.ts:75:6)
    at OpenAI.makeStatusError (/Users/prince/Documents/startup/DB-Sync/node_modules/langchain/node_modules/openai/src/core.ts:444:21)
    at OpenAI.makeRequest (/Users/prince/Documents/startup/DB-Sync/node_modules/langchain/node_modules/openai/src/core.ts:494:17)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /Users/prince/Documents/startup/DB-Sync/node_modules/langchain/dist/llms/openai.cjs:462:29
    at async RetryOperation._fn (/Users/prince/Documents/startup/DB-Sync/node_modules/p-retry/index.js:50:12) {
  status: 400,
  headers: {
    'access-control-allow-origin': '*',
    'alt-svc': 'h3=":443"; ma=86400',
    'cf-cache-status': 'DYNAMIC',
    'cf-ray': '808f46ea5f52f484-BOM',
    connection: 'keep-alive',
    'content-length': '293',
    'content-type': 'application/json',
    date: 'Tue, 19 Sep 2023 05:02:32 GMT',
    'openai-model': 'gpt-3.5-turbo-instruct',
    'openai-organization': 'creatr-cnqnye',
    'openai-processing-ms': '8',
    'openai-version': '2020-10-01',
    server: 'cloudflare',
    'strict-transport-security': 'max-age=15724800; includeSubDomains',
    'x-ratelimit-limit-requests': '3000',
    'x-ratelimit-remaining-requests': '2999',
    'x-ratelimit-reset-requests': '20ms',
    'x-request-id': '8ac4f0da9f403299fa2cd7469ff8f518'
  },
  error: {
    message: "This model's maximum context length is 4097 tokens, however you requested 4102 tokens (12 in your prompt; 4090 for the completion). Please reduce your prompt; or completion length.",
    type: 'invalid_request_error',
    param: null,
    code: null
  },
  code: null,
  param: null,
  type: 'invalid_request_error',
  attemptNumber: 1,
  retriesLeft: 6
}

Do note that this only happens when we use maxTokens: -1. I believe this might be an issue with js-tiktoken.
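
From the numbers in the error, the approximate fallback seems to undercount the prompt: a completion budget of 4090 implies an estimate of about 7 prompt tokens (4097 - 4090), while OpenAI counts 12, so the total lands at 4102 and the request is rejected. As a stopgap, the calculation can be sidestepped entirely by passing an explicit maxTokens instead of -1 (the cap below is just an example, not a recommended value):

import { OpenAI } from "langchain/llms/openai";

export async function instructTestWorkaround() {
  // Same setup as above, but with an explicit completion cap instead of -1,
  // so LangChain never has to count the prompt tokens itself.
  const gpt3Instruct = new OpenAI({
    temperature: 0.69,
    openAIApiKey: process.env.API_KEY,
    modelName: "gpt-3.5-turbo-instruct",
    topP: 1,
    maxTokens: 1024, // example cap, well inside the 4097-token context window
  });

  return await gpt3Instruct.predict("Tell me why is 1+1 = 3?");
}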

dosubot[bot] commented 1 year ago

🤖

Hello,

Thank you for reaching out and providing detailed information about the issue you're facing.

The message "Failed to calculate number of tokens, falling back to approximate count" is logged when the encodingForModel(modelName) function fails to calculate the number of tokens in the prompt. This can happen for a variety of reasons, such as an unsupported model name or an issue with the js-tiktoken library.
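
As a quick check, you can ask js-tiktoken directly whether it recognizes the model name. This is only a sketch and assumes a direct dependency on js-tiktoken; older versions of the library may not have gpt-3.5-turbo-instruct in their model-to-encoding map, in which case the lookup throws, which is the same situation that makes LangChain fall back to the approximate count:

import { encodingForModel } from "js-tiktoken";

try {
  // Throws if "gpt-3.5-turbo-instruct" is not in js-tiktoken's model map.
  const enc = encodingForModel("gpt-3.5-turbo-instruct");
  // If the lookup succeeds, this prints the exact token count for the prompt.
  console.log(enc.encode("Tell me why is 1+1 = 3?").length);
} catch (e) {
  console.warn("js-tiktoken does not recognize this model name:", e);
}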

The maxTokens: -1 setting you're using is what triggers this code path, but it is not the root cause of the error. When maxTokens is -1, the value is calculated by subtracting the number of tokens in the prompt from the model context size, which is determined by the getModelContextSize(modelName) function. (A computed maxTokens that came out negative would mean the prompt itself exceeds the model's context window, but that is not what is happening here.) The error you're seeing is specifically related to the calculation of the number of tokens in the prompt.

In the context shared, a maxTokens value of -1 is handled in the invocationParams function of the OpenAI class. If maxTokens is -1 and there are multiple inputs, an error is thrown; if there is only one input, the calculateMaxTokens function is called to work out the maximum number of completion tokens for that prompt. That calculation relies on encodingForModel(modelName) to count the prompt's tokens; when it fails, the warning you're seeing is logged and an approximate count is used instead, which can push the requested total past the model's context window (as in the 400 error above).
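
For reference, here is a minimal sketch of that calculation using the helpers mentioned above. The import path is an assumption and may differ between LangchainJS versions:

import {
  calculateMaxTokens,
  getModelContextSize,
} from "langchain/base_language/count_tokens";

export async function inspectTokenBudget() {
  const modelName = "gpt-3.5-turbo-instruct";
  const prompt = "Tell me why is 1+1 = 3?";

  // Context window for the model family (4097 tokens in the error above).
  const contextSize = getModelContextSize(modelName);

  // Context window minus the prompt's (possibly approximated) token count;
  // this is the value that ends up being sent as the completion budget.
  const maxTokens = await calculateMaxTokens({ prompt, modelName });

  console.log({ contextSize, maxTokens });
}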

I found a similar issue in the langchainjs repository: Failed to calculate number of tokens with tiktoken. The suggested solution is to upgrade LangchainJS to version 0.0.76 or newer, which replaces the WASM tiktoken package with a pure-JS version.

Please try upgrading your LangchainJS version and let me know if the issue persists. If it does, we can further investigate the problem.

Best, Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jacoblee93 commented 1 year ago

@dqbd can you have a look at this one?

jacoblee93 commented 1 year ago

Should be fixed!