pkoukk / tiktoken-go

go version of tiktoken
MIT License

help for count token #3

Closed 137-rick closed 1 year ago

137-rick commented 1 year ago

Hi, this is a very good and useful tool! Could you add support for a token counter, or suggest how I can implement one myself?

Ref: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb (section "6. Counting tokens for chat API calls")

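aphilas commented 1 year ago

package main

import "github.com/pkoukk/tiktoken-go"

// tokenizerEncoding is the encoding used by gpt-3.5-turbo and gpt-4.
const tokenizerEncoding = "cl100k_base"

// countTokens returns the number of tokens in prompt under tokenizerEncoding.
func countTokens(prompt string) (int, error) {
    enc, err := tiktoken.GetEncoding(tokenizerEncoding)
    if err != nil {
        return 0, err
    }

    // Pass nil for both special-token lists.
    tokens := enc.Encode(prompt, nil, nil)

    return len(tokens), nil
}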

// main_test.go
package main

import "testing"

func TestCountTokens(t *testing.T) {
    const prompt = "hello world!你好,世界!"
    const want = 10

    got, err := countTokens(prompt)
    if err != nil {
        // Stop here: got is meaningless if the encoding failed to load.
        t.Fatal(err)
    }

    if got != want {
        t.Errorf("got %v want %v", got, want)
    }
}

This is a Go transliteration of

import tiktoken

def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

from the link you shared.

Did I miss something?
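
One possible gap: the section named in the question ("6. Counting tokens for chat API calls") counts a list of chat messages, not a single string, and adds a fixed overhead per message. Below is a minimal Go sketch of that idea. The names `chatMessage`, `encodeFunc`, and `countChatTokens` are hypothetical and not part of tiktoken-go. The overhead constants are assumptions taken from the cookbook's values for gpt-3.5-turbo-0613 and gpt-4; other models may use different values, so treat the result as an estimate.

```go
package main

// encodeFunc turns text into token IDs. It is a hypothetical adapter type for
// this sketch, so the counting logic does not depend on a specific tokenizer.
type encodeFunc func(text string) []int

// chatMessage mirrors one entry of a chat request (hypothetical type).
type chatMessage struct {
	Role    string
	Content string
	Name    string // optional; empty means "not set"
}

// Overhead values assumed from the OpenAI cookbook for gpt-3.5-turbo-0613
// and gpt-4. They are model-dependent and may change.
const (
	tokensPerMessage = 3 // framing tokens added around every message
	tokensPerName    = 1 // extra token when a name is present
	replyPriming     = 3 // tokens that prime the assistant's reply
)

// countChatTokens estimates the prompt tokens used by a list of messages:
// per-message overhead plus the encoded length of each field that is set.
func countChatTokens(encode encodeFunc, messages []chatMessage) int {
	total := 0
	for _, m := range messages {
		total += tokensPerMessage
		total += len(encode(m.Role))
		total += len(encode(m.Content))
		if m.Name != "" {
			total += len(encode(m.Name))
			total += tokensPerName
		}
	}
	return total + replyPriming
}
```

To use it with the encoder from the answer above, wrap the call in a closure, for example `func(s string) []int { return enc.Encode(s, nil, nil) }`, and pass that as `encode`. Taking a function instead of the encoder itself keeps the sketch free of external imports and makes it easy to test with a fake tokenizer.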