
[Question]: How to count tokens in Anthropic models? #13952

Open Ninlawat-Puhu opened 1 month ago

Ninlawat-Puhu commented 1 month ago

Question

Hi, how can I count tokens (Embedding Tokens, LLM Prompt Tokens, LLM Completion Tokens, Total LLM Token Count) for Anthropic models?

dosubot[bot] commented 1 month ago

To count tokens (Embedding Tokens, LLM Prompt Tokens, LLM Completion Tokens, Total LLM Token Count) for Anthropic models using the TokenCountingHandler class, follow these steps:

  1. Install LlamaIndex:

    !pip install llama-index
  2. Initialize the TokenCountingHandler:

    import tiktoken
    from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
    from llama_index.llms.openai import OpenAI
    from llama_index.core import Settings
    
    token_counter = TokenCountingHandler(
        tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
    )
    
    Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.2)
    Settings.callback_manager = CallbackManager([token_counter])
  3. Handle Events:

    • When the handler is registered on the callback manager, LlamaIndex calls on_event_end automatically at the end of each LLM event, with the prompt and completion in the payload.
    • Embedding events are handled the same way, with the embedded text chunks in the payload; you can also call on_event_end manually to simulate either event, as in the snippet below.
  4. Access Token Counts:

    • total_llm_token_count: Total LLM token count.
    • prompt_llm_token_count: Total LLM prompt token count.
    • completion_llm_token_count: Total LLM completion token count.
    • total_embedding_token_count: Total Embedding token count.

Here is an example code snippet that simulates both event types manually:

from llama_index.core.callbacks import TokenCountingHandler, CBEventType, EventPayload

# Initialize the handler (uses the default tokenizer unless one is passed)
token_counting_handler = TokenCountingHandler(verbose=True)

# Simulate an LLM event. The handler looks up payload values by the
# EventPayload enum keys (e.g. EventPayload.PROMPT is "formatted_prompt"),
# so a plain string key like "prompt" would not be found.
llm_payload = {
    EventPayload.PROMPT: "What is the capital of France?",
    EventPayload.COMPLETION: "The capital of France is Paris.",
}
token_counting_handler.on_event_end(CBEventType.LLM, payload=llm_payload)

# Simulate an Embedding event
embedding_payload = {
    EventPayload.CHUNKS: ["This is a test sentence for embedding."],
}
token_counting_handler.on_event_end(CBEventType.EMBEDDING, payload=embedding_payload)

# Access the token counts
print("Total LLM Token Count:", token_counting_handler.total_llm_token_count)
print("LLM Prompt Token Count:", token_counting_handler.prompt_llm_token_count)
print("LLM Completion Token Count:", token_counting_handler.completion_llm_token_count)
print("Total Embedding Token Count:", token_counting_handler.total_embedding_token_count)

This code will count the tokens for the given LLM and Embedding events and print the respective token counts. For more detailed usage, including handling streaming and advanced token counting, you can refer to the TokenCountingHandler documentation [1][2].
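
In real applications you typically don't call on_event_end yourself: once the handler is registered on Settings.callback_manager, LlamaIndex fires the LLM and Embedding events for you during indexing and querying. A minimal sketch of that flow, assuming an OpenAI API key is set and that a ./data directory with documents exists (both are placeholders):

import tiktoken
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.openai import OpenAI

token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.2)
Settings.callback_manager = CallbackManager([token_counter])

# Indexing triggers Embedding events; querying triggers both an
# Embedding event (for the query) and LLM events (for the answer).
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
response = index.as_query_engine().query("What is this document about?")

print("Embedding Tokens:", token_counter.total_embedding_token_count)
print("LLM Prompt Tokens:", token_counter.prompt_llm_token_count)
print("LLM Completion Tokens:", token_counter.completion_llm_token_count)
print("Total LLM Token Count:", token_counter.total_llm_token_count)

# Counts accumulate across calls; reset if you want per-query numbers.
token_counter.reset_counts()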


Ninlawat-Puhu commented 1 month ago

@logan-markewich TokenCountingHandler does not support Anthropic models. Could you recommend another way to count tokens?

ppipada commented 1 week ago

@Ninlawat-Puhu It is the same as above, except for the tokenizer parameter in TokenCountingHandler. You need to do:

from llama_index.llms.anthropic import Anthropic  # assuming the Anthropic integration is installed

token_counter = TokenCountingHandler(
    tokenizer=Anthropic().tokenizer.encode
)

See here: https://docs.llamaindex.ai/en/stable/examples/llm/anthropic/#set-tokenizer
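
Putting the two together, here is a minimal end-to-end sketch. It assumes llama-index-llms-anthropic is installed and ANTHROPIC_API_KEY is set; the model name claude-3-opus-20240229 and the ./data directory are placeholders:

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.anthropic import Anthropic

# Count with Anthropic's own tokenizer so the numbers match the model
token_counter = TokenCountingHandler(
    tokenizer=Anthropic().tokenizer.encode
)

Settings.llm = Anthropic(model="claude-3-opus-20240229")  # placeholder model name
Settings.tokenizer = Anthropic().tokenizer
Settings.callback_manager = CallbackManager([token_counter])

# Embeddings still come from the default embedding model (OpenAI unless
# overridden), since Anthropic does not provide an embedding API.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
response = index.as_query_engine().query("Summarize the documents.")

print("LLM Prompt Tokens:", token_counter.prompt_llm_token_count)
print("LLM Completion Tokens:", token_counter.completion_llm_token_count)
print("Total LLM Token Count:", token_counter.total_llm_token_count)

Note that TokenCountingHandler applies the single tokenizer it was given to both LLM and Embedding events, so the embedding counts here are approximations made with Anthropic's tokenizer rather than the embedding model's own.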