hyp530 closed 10 months ago
The token count differs significantly from the one reported by https://platform.openai.com/tokenizer.
package main

import (
	"fmt"

	"github.com/pkoukk/tiktoken-go"
)

func main() {
	text := "这是一个测试"
	tke, _ := tiktoken.GetEncoding("cl100k_base")
	token := tke.Encode(text, nil, nil)
	fmt.Println(len(token)) // Result: 4
}
The OpenAI Tokenizer reports 10 tokens for the same text.
The OpenAI Tokenizer uses GPT-3, whose encoding is p50k_base, not cl100k_base. To get a matching count, pass "p50k_base" to tiktoken.GetEncoding.
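Both counts are plausible for the same string. As a rough sanity check (a sketch using only the standard library, not part of tiktoken-go): these encodings are byte-level BPE, so without special tokens a non-empty text can never produce more tokens than it has UTF-8 bytes. The helper name `tokenBounds` is made up for this illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// tokenBounds (hypothetical helper) returns the rune count and the UTF-8
// byte count of text. For a byte-level BPE encoding with no special tokens,
// the token count of non-empty text lies between 1 and the byte count.
func tokenBounds(text string) (runeCount, byteCount int) {
	return utf8.RuneCountInString(text), len(text)
}

func main() {
	runeCount, byteCount := tokenBounds("这是一个测试")
	// Each of these characters takes 3 bytes in UTF-8.
	fmt.Printf("runes=%d bytes=%d\n", runeCount, byteCount) // runes=6 bytes=18
}
```

The 4 tokens from cl100k_base and the 10 tokens from the web tokenizer both fall within the 1 to 18 range. The gap comes from how many multi-byte merges each vocabulary contains for Chinese text, not from a counting bug.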