langchain-ai / langchainjs


Failed to calculate number of tokens, falling back to approximate count Error: Unknown model #6493

Closed: TowhidKashem closed this issue 3 weeks ago

TowhidKashem commented 1 month ago


Example Code

Does anyone know how to implement a custom tokenizer? At the moment, if you try to use any chat model other than ChatOpenAI, you get this error:

Failed to calculate number of tokens, falling back to approximate count Error: Unknown model

It seems that with OpenAI, the tiktoken tokenizer is downloaded from a remote URL and used, but for all other models it's up to us to implement a custom one. I can't find any instructions in the docs on how to do this.

I am trying to use the ChatTogetherAI class to load a Llama 3 model. There is a local tokenizer for Llama 3 I could use:

https://github.com/belladoreai/llama3-tokenizer-js

but I don't know how to actually hook it up to my chain:

this.model = new ChatTogetherAI({
  model: 'meta-llama/Meta-Llama-3-8B-Instruct-Turbo',
  apiKey: 'xxx',
  temperature: 1.0,
  maxRetries: 3,
  timeout: 10 * 1_000,
  maxTokens: MAX_OUTPUT_TOKENS,
  verbose: VERBOSE,
  streaming: this.streaming
});

this.history = new ChatMessageHistory([
   // ...
]);

this.memory = new ConversationSummaryBufferMemory({
  llm: this.model,
  chatHistory: this.history,
  memoryKey: 'chat_history',
  inputKey: 'user_message',
  outputKey: 'response',
  maxTokenLimit: 20_000,
  returnMessages: true
});

this.chain = new ConversationChain({
  llm: this.model,
  memory: this.memory,
  prompt: promptChain,
  outputKey: 'response'
});

Error Message and Stack Trace (if applicable)

No response

Description

When using any model provider other than OpenAI:

Failed to calculate number of tokens, falling back to approximate count Error: Unknown model

The docs are not clear about how to add a local tokenizer for other providers.

System Info

langchain": "^0.2.5" OS: Mac Node: v20.9.0 Yarn: 1.22.22

dosubot[bot] commented 1 month ago

To resolve the token calculation error and integrate the llama3Tokenizer with your ChatTogetherAI setup, follow these steps:

1. Install the llama3-tokenizer-js package: npm install llama3-tokenizer-js
2. Modify the ChatTogetherAI class to accept a custom tokenizer and use it for tokenization.
3. Pass the custom tokenizer to the ChatTogetherAI instance so that llama3Tokenizer is used in your chain setup.

Here is the detailed implementation:

Step 1: Install the llama3-tokenizer-js Package

npm install llama3-tokenizer-js
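
Before wiring it into LangChain, you can sanity-check the tokenizer on its own; the default export and encode method below follow the llama3-tokenizer-js README:

import llama3Tokenizer from 'llama3-tokenizer-js';

// encode() returns an array of Llama 3 token ids; its length is the token count.
console.log(llama3Tokenizer.encode('hello world').length);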

Step 2: Modify the ChatTogetherAI Class

Update the ChatTogetherAI class to accept a custom tokenizer and use it for tokenization.

// Proposed edit to the TogetherAI source in @langchain/community (togetherai.ts)
import llama3Tokenizer from 'llama3-tokenizer-js';
import { LLM, type BaseLLMParams } from '@langchain/core/language_models/llms';

interface TogetherAIInputs extends BaseLLMParams {
  // ... other parameters
  tokenizer?: any; // Add this line to accept a custom tokenizer
}

export class TogetherAI extends LLM<TogetherAICallOptions> {
  // ... other properties
  private tokenizer: any; // Add this line to store the custom tokenizer

  constructor(inputs: TogetherAIInputs) {
    super(inputs);
    // ... other initializations
    this.tokenizer = inputs.tokenizer ?? llama3Tokenizer; // Use the custom tokenizer, defaulting to the local Llama 3 one
  }

  private constructBody(prompt: string, options?: this["ParsedCallOptions"]) {
    // encode() yields token ids; passing them as `prompt` assumes the API
    // accepts pre-tokenized input. Keep the raw string if it does not.
    const tokenizedPrompt = this.tokenizer ? this.tokenizer.encode(prompt) : prompt;
    const body = {
      model: options?.model ?? options?.modelName ?? this?.model,
      prompt: tokenizedPrompt,
      temperature: this?.temperature ?? options?.temperature,
      top_k: this?.topK ?? options?.topK,
      top_p: this?.topP ?? options?.topP,
      repetition_penalty: this?.repetitionPenalty ?? options?.repetitionPenalty,
      logprobs: this?.logprobs ?? options?.logprobs,
      stream_tokens: this?.streaming,
      safety_model: this?.safetyModel ?? options?.safetyModel,
      max_tokens: this?.maxTokens ?? options?.maxTokens,
      stop: this?.stop ?? options?.stop,
    };
    return body;
  }

  // ... other methods
}
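
Patching the library class means forking @langchain/community, so as an alternative that leaves library code untouched, you can override the token counter on the model instance itself. This is a minimal sketch, assuming your LangChain version exposes getNumTokens on the model (the method whose tiktoken lookup prints the "Unknown model" warning):

import llama3Tokenizer from 'llama3-tokenizer-js';
import { ChatTogetherAI } from '@langchain/community/chat_models/togetherai';

const model = new ChatTogetherAI({
  model: 'meta-llama/Meta-Llama-3-8B-Instruct-Turbo',
  apiKey: 'xxx',
});

// Shadow the inherited getNumTokens so memory classes that count tokens use
// the local Llama 3 tokenizer instead of tiktoken's model-name lookup.
// (Depending on your version the parameter may be typed MessageContent
// rather than string, in which case a cast is needed.)
model.getNumTokens = async (text: string) => llama3Tokenizer.encode(text).length;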

Step 3: Pass the Custom Tokenizer to the ChatTogetherAI Instance

Update your chain setup to pass the llama3Tokenizer to the ChatTogetherAI instance.

import llama3Tokenizer from 'llama3-tokenizer-js';

this.model = new ChatTogetherAI({
  model: 'meta-llama/Meta-Llama-3-8B-Instruct-Turbo',
  apiKey: 'xxx',
  temperature: 1.0,
  maxRetries: 3,
  timeout: 10 * 1_000,
  maxTokens: MAX_OUTPUT_TOKENS,
  verbose: VERBOSE,
  streaming: this.streaming,
  tokenizer: llama3Tokenizer // Pass the custom tokenizer here
});

this.history = new ChatMessageHistory([
   // ...
]);

this.memory = new ConversationSummaryBufferMemory({
  llm: this.model,
  chatHistory: this.history,
  memoryKey: 'chat_history',
  inputKey: 'user_message',
  outputKey: 'response',
  maxTokenLimit: 20_000,
  returnMessages: true
});

this.chain = new ConversationChain({
  llm: this.model,
  memory: this.memory,
  prompt: promptChain,
  outputKey: 'response'
});
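
As a quick check: ConversationSummaryBufferMemory counts tokens through the model when it prunes history, which is where the "Unknown model" warning surfaces. Assuming the inputKey of 'user_message' configured above, something like this should now count tokens with the local tokenizer:

const result = await this.chain.invoke({ user_message: 'Hello!' });
console.log(result.response);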

By following these steps, you should be able to integrate the llama3Tokenizer with your ChatTogetherAI setup and resolve the token calculation error [1].

To continue talking to Dosu, mention @dosu.

TowhidKashem commented 1 month ago

@dosu should I be making this change to the TogetherAI class or the ChatTogetherAI class?

export class TogetherAI extends LLM<TogetherAICallOptions> {
  // etc...
}

dosubot[bot] commented 1 month ago

You should be making this change to the ChatTogetherAI class.

Supporting evidence:

import { ChatTogetherAI } from "@langchain/community/chat_models/togetherai";

Additional context from the documentation:

# ChatTogetherAI

## Setup

1. Create a TogetherAI account and get your API key [here](https://api.together.xyz/).
2. Export or set your API key inline. The ChatTogetherAI class defaults to `process.env.TOGETHER_AI_API_KEY`.

You can use models provided by TogetherAI as shown in the example further below.

Further details from the source code:

/**
 * Wrapper around TogetherAI API for large language models fine-tuned for chat
 *
 * TogetherAI API is compatible to the OpenAI API with some limitations. View the
 * full API ref at:
 * @link {https://docs.together.ai/reference/chat-completions}
 *
 * To use, you should have the `TOGETHER_AI_API_KEY` environment variable set.
 * @example
 * ```typescript
 * const model = new ChatTogetherAI({
 *   temperature: 0.9,
 *   apiKey: process.env.TOGETHER_AI_API_KEY,
 * });
 *
 * const response = await model.invoke([new HumanMessage("Hello there!")]);
 * console.log(response);
 * ```
 */
export class ChatTogetherAI extends ChatOpenAI<ChatTogetherAICallOptions> {
  // Class implementation
}
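
This inheritance also explains the original warning: ChatTogetherAI inherits ChatOpenAI's token counting, which asks tiktoken for an encoding matching the model name, and a name like meta-llama/Meta-Llama-3-8B-Instruct-Turbo is unknown to tiktoken, so LangChain logs the warning and falls back to an approximate count (roughly one token per four characters). If you want a reusable class rather than patching each instance, a hypothetical subclass could override the counter (a sketch only; the exact getNumTokens signature may differ across versions):

import llama3Tokenizer from 'llama3-tokenizer-js';
import { ChatTogetherAI } from '@langchain/community/chat_models/togetherai';

// Hypothetical subclass that counts tokens with the local Llama 3 tokenizer
// instead of tiktoken's model-name lookup.
class ChatTogetherAILlama3 extends ChatTogetherAI {
  async getNumTokens(text: string): Promise<number> {
    return llama3Tokenizer.encode(text).length;
  }
}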

The TogetherAI class, by contrast, is a plain text-completion (LLM) wrapper with a different set of parameters and methods. Here is a snippet from its implementation:

export class TogetherAI extends LLM<TogetherAICallOptions> {
  // Class implementation
}

[1][2][3][4]

To continue talking to Dosu, mention @dosu.