Cainier / gpt-tokens

Calculate the token consumption and cost of OpenAI GPT messages
MIT License

Add performance test and minor code refactor #50

Closed: linj121 closed this 4 months ago

linj121 commented 4 months ago

First of all, I would like to thank the author for this project ❤️, which really makes my life a lot easier. I also learnt a lot from its coding style and structure.

Here are the changes I made:

Performance Test

-> usedTokens time: 0.1168 -> promptUsedTokens time: 0.3037 -> completionUsedTokens time: 0.1459 -> usedUSD time: 1.7766 Total time: 2.343

-> usedTokens time: 0.1139 -> promptUsedTokens time: 0.1359 -> completionUsedTokens time: 0.0042 -> usedUSD time: 0.2422 Total time: 0.4962

-> usedTokens time: 0.0684 -> promptUsedTokens time: 0.0445 -> completionUsedTokens time: 0.0027 -> usedUSD time: 0.3555 Total time: 0.4711

-> usedTokens time: 0.0309 -> promptUsedTokens time: 0.0401 -> completionUsedTokens time: 0.0016 -> usedUSD time: 0.1136 Total time: 0.1862

-> usedTokens time: 0.0555 -> promptUsedTokens time: 0.0387 -> completionUsedTokens time: 0.0023 -> usedUSD time: 0.2473 Total time: 0.3438

-> usedTokens time: 0.0297 -> promptUsedTokens time: 0.0399 -> completionUsedTokens time: 0.0022 -> usedUSD time: 0.1403 Total time: 0.2121

-> usedTokens time: 0.0276 -> promptUsedTokens time: 0.0391 -> completionUsedTokens time: 0.0023 -> usedUSD time: 0.2982 Total time: 0.3672

-> usedTokens time: 0.0424 -> promptUsedTokens time: 0.0535 -> completionUsedTokens time: 0.0024 -> usedUSD time: 0.0955 Total time: 0.1938

-> usedTokens time: 0.0288 -> promptUsedTokens time: 0.0421 -> completionUsedTokens time: 0.0049 -> usedUSD time: 0.1504 Total time: 0.2262

-> usedTokens time: 0.0308 -> promptUsedTokens time: 0.0435 -> completionUsedTokens time: 0.0062 -> usedUSD time: 0.1025 Total time: 0.183

-> usedTokens time: 0.0274 -> promptUsedTokens time: 0.0395 -> completionUsedTokens time: 0.0018 -> usedUSD time: 0.0775 Total time: 0.1462

-> usedTokens time: 0.0263 -> promptUsedTokens time: 0.0389 -> completionUsedTokens time: 0.0019 -> usedUSD time: 0.0803 Total time: 0.1474

-> usedTokens time: 0.027 -> promptUsedTokens time: 0.0391 -> completionUsedTokens time: 0.0018 -> usedUSD time: 0.0952 Total time: 0.1631

-> usedTokens time: 0.0266 -> promptUsedTokens time: 0.0383 -> completionUsedTokens time: 0.0021 -> usedUSD time: 0.0749 Total time: 0.1419

-> usedTokens time: 0.0269 -> promptUsedTokens time: 0.0386 -> completionUsedTokens time: 0.0019 -> usedUSD time: 0.0752 Total time: 0.1426

-> usedTokens time: 0.0271 -> promptUsedTokens time: 0.0393 -> completionUsedTokens time: 0.0018 -> usedUSD time: 8.2488 Total time: 8.317

-> usedTokens time: 0.0362 -> promptUsedTokens time: 0.0972 -> completionUsedTokens time: 0.0649 -> usedUSD time: 0.461 Total time: 0.6593

-> usedTokens time: 0.0333 -> promptUsedTokens time: 0.0798 -> completionUsedTokens time: 0.002 -> usedUSD time: 0.1261 Total time: 0.2412

-> usedTokens time: 0.0322 -> promptUsedTokens time: 0.0702 -> completionUsedTokens time: 0.0019 -> usedUSD time: 0.1382 Total time: 0.2425

-> usedTokens time: 0.0373 -> promptUsedTokens time: 0.0605 -> completionUsedTokens time: 0.0013 -> usedUSD time: 0.0901 Total time: 0.1892

End of Batch 1 >>>

Start of Batch 2 >>>

Testing performance... Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 10

Testing performance... Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 100

Testing performance... Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 1000

Testing performance... Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 10000

Testing performance... Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 100000

Testing performance... Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 500000

Testing performance... Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 1000000

Statistical Information (every run uses the setting {"options":{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}}; only the number of iterations varies):

| Iterations | Used tokens per call | Total execution time (ms) | Calls per iteration | Avg execution time per iteration (ms) | Avg execution time per call (ms) |
| --- | --- | --- | --- | --- | --- |
| 10 | 9 | 1.5428999960422516 | 4 | 0.15429 | 0.03857 |
| 100 | 9 | 14.768799722194672 | 4 | 0.14769 | 0.03692 |
| 1000 | 9 | 96.91169968247414 | 4 | 0.09691 | 0.02423 |
| 10000 | 9 | 714.3903965950012 | 4 | 0.07144 | 0.01786 |
| 100000 | 9 | 8780.892964661121 | 4 | 0.08781 | 0.02195 |
| 500000 | 9 | 39309.42104059458 | 4 | 0.07862 | 0.01965 |
| 1000000 | 9 | 77574.36164027452 | 4 | 0.07757 | 0.01939 |

End of Batch 2 >>>

End of Test Result ( Thu May 16 2024 16:33:31 GMT-0400 (Eastern Daylight Time) ) >>>


- The test result looks good to me, but to address this issue: https://github.com/Cainier/gpt-tokens/issues/23, I'll add more test cases in the future (e.g. testing with longer message inputs, comparing js-tiktoken with WASM tiktoken, and measuring cold-start speed). According to the benchmark [here](https://dev.to/maximsaplin/how-fast-is-js-tiktoken-3fmk), the WASM version of tiktoken is faster than js-tiktoken only when the input is large (923942 tokens).
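For readers who want to reproduce numbers like the ones above, the per-getter timings and the summary statistics could be gathered roughly as follows. This is a minimal sketch, not the test code from this PR: `timeGetters`, `summarize`, and the `standIn` object are hypothetical names, and `standIn` replaces a real `GPTTokens` instance (with placeholder values) so the snippet runs without dependencies.

```typescript
// Hypothetical benchmark helpers; names are illustrative, not taken from the PR.
type Timings = Record<string, number>;

// Run each named thunk once and record its elapsed time in milliseconds.
function timeGetters(getters: Record<string, () => unknown>): Timings {
  const timings: Timings = {};
  for (const [name, read] of Object.entries(getters)) {
    const start = performance.now();
    read();
    timings[name] = performance.now() - start;
  }
  return timings;
}

// Aggregate the same kind of statistics that the log reports.
function summarize(runs: Timings[]) {
  const totalMs = runs.reduce(
    (sum, run) => sum + Object.values(run).reduce((a, b) => a + b, 0),
    0,
  );
  const first = runs[0];
  const callsPerIteration = first ? Object.keys(first).length : 0;
  const calls = runs.length * callsPerIteration;
  return {
    iterations: runs.length,
    callsPerIteration,
    totalMs,
    avgPerIterationMs: runs.length > 0 ? totalMs / runs.length : 0,
    avgPerCallMs: calls > 0 ? totalMs / calls : 0,
  };
}

// Stand-in for a GPTTokens instance; the returned values are placeholders.
const standIn = {
  get usedTokens() { return 9; },
  get promptUsedTokens() { return 9; },
  get completionUsedTokens() { return 0; },
  get usedUSD() { return 0; },
};

const runs = Array.from({ length: 10 }, () =>
  timeGetters({
    usedTokens: () => standIn.usedTokens,
    promptUsedTokens: () => standIn.promptUsedTokens,
    completionUsedTokens: () => standIn.completionUsedTokens,
    usedUSD: () => standIn.usedUSD,
  }),
);
const stats = summarize(runs);
console.log(stats);
```

Early iterations in a run like this tend to be slower than later ones (JIT warm-up, cache population), which is one reason to report averages over many iterations rather than a single measurement.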

Minor refactor

I've moved modelEncodingCache and getEncodingForModelCached inside GPTTokens and made them protected static, so that they are only accessible from within the class and its subclasses. This also makes the API better encapsulated.
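The pattern looks roughly like the sketch below. It is illustrative only: `TokenCounter`, `VerboseTokenCounter`, `loadEncoder`, and the toy word-splitting encoder are hypothetical stand-ins (the real class is `GPTTokens` and the real encoders come from js-tiktoken); only the two member names are taken from the description above.

```typescript
// Minimal encoder shape; a stand-in for a real tokenizer encoder.
interface Encoder {
  readonly model: string;
  encode(text: string): number[];
}

// Hypothetical loader standing in for an expensive encoder construction.
let loadCount = 0;
function loadEncoder(model: string): Encoder {
  loadCount += 1;
  return {
    model,
    // Toy encoding: one "token" per whitespace-separated word.
    encode: (text) => text.split(/\s+/).filter(Boolean).map((_, i) => i),
  };
}

class TokenCounter {
  // Shared by all instances. `protected` hides it from outside callers
  // while keeping it reachable from subclasses.
  protected static modelEncodingCache = new Map<string, Encoder>();

  protected static getEncodingForModelCached(model: string): Encoder {
    let encoder = TokenCounter.modelEncodingCache.get(model);
    if (!encoder) {
      encoder = loadEncoder(model);
      TokenCounter.modelEncodingCache.set(model, encoder);
    }
    return encoder;
  }

  constructor(private readonly model: string) {}

  count(text: string): number {
    return TokenCounter.getEncodingForModelCached(this.model).encode(text).length;
  }
}

// A subclass can still read the protected static cache.
class VerboseTokenCounter extends TokenCounter {
  static cachedModels(): string[] {
    return Array.from(VerboseTokenCounter.modelEncodingCache.keys());
  }
}

const first = new TokenCounter("gpt-3.5-turbo-0613");
const second = new TokenCounter("gpt-3.5-turbo-0613");
first.count("Hello world");
second.count("Hello world"); // reuses the encoder cached by the first call
```

Code outside the class hierarchy, such as `TokenCounter.modelEncodingCache.clear()`, is rejected by the TypeScript compiler, so the cache cannot be modified accidentally by library users.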

NPM registry error

I got the following auth error when running npm i with the latest package-lock.json:

npm ERR! code E401
npm ERR! 401 Unauthorized - GET https://srun-npm.pkg.coding.net/srun4-portal/portal-core/whatwg-url/-/whatwg-url-5.0.0.tgz - Invalid credential. Please make sure you entered the correct username and password.

It seems that this npm registry requires some sort of authentication, so I replaced these registry URLs with the official registry https://registry.npmjs.org
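One way to do such a replacement mechanically is to rewrite every `resolved` URL in the lockfile. The helper below is a hypothetical sketch, not the script used for this PR, and it assumes the private registry mirrors the public tarball path layout (`<name>/-/<name>-<version>.tgz`) under its own prefix.

```typescript
// Hypothetical helper: point every `resolved` URL in a lockfile at another registry.
function rewriteResolved(lockfileText: string, fromPrefix: string, toPrefix: string): string {
  const lock: unknown = JSON.parse(lockfileText);

  // Walk the whole JSON tree so nested dependency entries are covered too.
  const visit = (node: unknown): void => {
    if (node === null || typeof node !== "object") return;
    const record = node as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      const value = record[key];
      if (key === "resolved" && typeof value === "string" && value.startsWith(fromPrefix)) {
        record[key] = toPrefix + value.slice(fromPrefix.length);
      } else {
        visit(value);
      }
    }
  };

  visit(lock);
  return JSON.stringify(lock, null, 2) + "\n";
}

// Example entry built from the URL in the error message above.
const sample = JSON.stringify({
  packages: {
    "node_modules/whatwg-url": {
      version: "5.0.0",
      resolved:
        "https://srun-npm.pkg.coding.net/srun4-portal/portal-core/whatwg-url/-/whatwg-url-5.0.0.tgz",
    },
  },
});

const fixed = rewriteResolved(
  sample,
  "https://srun-npm.pkg.coding.net/srun4-portal/portal-core/",
  "https://registry.npmjs.org/",
);
console.log(fixed);
```

Deleting package-lock.json and regenerating it with the correct registry configured is a simpler alternative when exact version pinning does not need to be preserved.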

Regression test

Added process.env.FINE_TUNE_MODEL: const model = process.env.FINE_TUNE_MODEL || 'ft:gpt-3.5-turbo-1106:opensftp::8IWeqPit', so that developers can use their own fine-tuned model for testing (the fine-tuned model originally listed there is not accessible to me).
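The fallback behaves as in the sketch below. `resolveFineTuneModel` is a hypothetical wrapper added here only so the behaviour can be shown in isolation (the PR inlines the expression), and the `my-org` model ID is a made-up placeholder.

```typescript
const DEFAULT_FINE_TUNE_MODEL = "ft:gpt-3.5-turbo-1106:opensftp::8IWeqPit";

// Hypothetical wrapper around the inline expression from the PR.
function resolveFineTuneModel(env: Record<string, string | undefined>): string {
  // `||` (rather than `??`) also falls back when the variable is set but empty.
  return env["FINE_TUNE_MODEL"] || DEFAULT_FINE_TUNE_MODEL;
}

// In the test script this would be called as resolveFineTuneModel(process.env).
const fromDefault = resolveFineTuneModel({});
const fromEnv = resolveFineTuneModel({
  FINE_TUNE_MODEL: "ft:gpt-3.5-turbo-1106:my-org::abc123", // placeholder ID
});
console.log(fromDefault, fromEnv);
```

A contributor would then run the test with `FINE_TUNE_MODEL` set to a model their own API key can access.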

The test result (for my latest commit) looks good to me. The DeprecationWarning is caused by Node.js v21; switching to v20 makes it go away.

Testing GPT...
[1/20]: Testing gpt-3.5-turbo...
(node:725962) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Pass!
[2/20]: Testing gpt-3.5-turbo-16k...
Pass!
[3/20]: Testing gpt-4...
Pass!
[4/20]: Testing gpt-4-32k...
Ignore model gpt-4-32k:
404 The model `gpt-4-32k` does not exist or you do not have access to it.
[5/20]: Testing gpt-4-turbo-preview...
Pass!
[6/20]: Testing gpt-4-turbo...
Pass!
[7/20]: Testing gpt-4o...
Pass!
[8/20]: Testing gpt-4o-2024-05-13...
Pass!
[9/20]: Testing gpt-4-turbo-2024-04-09...
Pass!
[10/20]: Testing gpt-4-0314...
Ignore model gpt-4-0314:
404 The model `gpt-4-0314` has been deprecated, learn more here: https://platform.openai.com/docs/deprecations
[11/20]: Testing gpt-4-32k-0314...
Ignore model gpt-4-32k-0314:
404 The model `gpt-4-32k-0314` has been deprecated, learn more here: https://platform.openai.com/docs/deprecations
[12/20]: Testing gpt-4-0613...
Pass!
[13/20]: Testing gpt-4-32k-0613...
Ignore model gpt-4-32k-0613:
404 The model `gpt-4-32k-0613` does not exist or you do not have access to it.
[14/20]: Testing gpt-4-1106-preview...
Pass!
[15/20]: Testing gpt-4-0125-preview...
Pass!
[16/20]: Testing gpt-3.5-turbo-0301...
Pass!
[17/20]: Testing gpt-3.5-turbo-0613...
Pass!
[18/20]: Testing gpt-3.5-turbo-16k-0613...
Pass!
[19/20]: Testing gpt-3.5-turbo-1106...
Pass!
[20/20]: Testing gpt-3.5-turbo-0125...
Pass!
Test success!
Testing function calling...
Pass!
Testing fine-tune...
Pass!
Testing Create a fine-tuned model...
Pass!
Testing performance...
Messages: [{"role":"user","content":"Hello world"}]
GPTTokens: 1.403ms
GPTTokens: 0.351ms
GPTTokens: 0.289ms
GPTTokens: 0.284ms
GPTTokens: 1.11ms
GPTTokens: 0.412ms
GPTTokens: 0.176ms
GPTTokens: 0.3ms
GPTTokens: 0.235ms
GPTTokens: 0.265ms

I suggest setting up a CI/CD pipeline for automated testing to make development and contribution easier.

Cainier commented 4 months ago

Thank you very much for your contribution to this project:

  1. The npm registry error happened because I mistakenly changed my global npm configuration to a private registry while developing another project. Thanks for finding and raising the issue. I will delete package-lock.json and re-run npm i to regenerate the file.

  2. The model ft:gpt-3.5-turbo-1106:opensftp::8IWeqPit only works when testing with my own access key. This is indeed a problem; I will replace it with an environment variable, following the approach in your submission.

  3. Regarding WASM: starting from v1.1, the project switched from @dqbd/tiktoken to js-tiktoken, because using WASM on the web requires extra vite/webpack configuration. https://github.com/Cainier/gpt-tokens/tree/v1.0.9

If there is a need for WASM, I will try to add a useWASM option in the next version.

I will try to use a CI/CD pipeline for automated testing and building, but the access key is a problem; GitHub environment variables might solve it.
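One possible way to keep such a pipeline useful even when no key is available (for example on pull requests from forks) is to separate tests that call the OpenAI API from tests that run offline. The sketch below is an assumption-heavy illustration: `selectTests`, the `TestCase` shape, and the `OPENAI_API_KEY` variable name are hypothetical and not part of the project's current test script.

```typescript
// Hypothetical test descriptor; `needsApiKey` marks tests that call the OpenAI API.
type TestCase = { name: string; needsApiKey: boolean };

// Decide which tests to run based on whether a key is present in the environment.
function selectTests(tests: TestCase[], env: Record<string, string | undefined>) {
  const hasKey = Boolean(env["OPENAI_API_KEY"]); // assumed variable name
  return {
    toRun: tests.filter((t) => hasKey || !t.needsApiKey).map((t) => t.name),
    skipped: tests.filter((t) => !hasKey && t.needsApiKey).map((t) => t.name),
  };
}

// Test names follow the log output earlier in this thread.
const suite: TestCase[] = [
  { name: "Testing GPT...", needsApiKey: true },
  { name: "Testing function calling...", needsApiKey: true },
  { name: "Testing performance...", needsApiKey: false },
];

const withoutKey = selectTests(suite, {});
const withKey = selectTests(suite, { OPENAI_API_KEY: "placeholder-key" });
console.log(withoutKey, withKey);
```

In CI, the key would be injected from the repository's secret store into the environment, and the script would be called with `process.env`.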