Add performance test and minor code refactor

First of all, I would like to thank the author for this project ❤️, which really makes my life a lot easier. I also learnt a lot from its coding style and structure.

Here are the changes I made:

Performance Test

I've created a new performance testing suite in tests/perf.js, which can be run with npm run test:perf.
The reason is that I plan to integrate the tiktoken WASM binding into this project in the future. Before that, I would like to set up more fine-grained performance tests, so that we can measure the performance difference between js-tiktoken and tiktoken.
Inspired by testPerformance(messages) in test.js, I set up tests in tests/perf.js that measure the execution time of each of the following operations separately: usageInfo.usedTokens, usageInfo.promptUsedTokens, usageInfo.completionUsedTokens and usageInfo.usedUSD.
To remove the impact of cold start, a GPTTokens instance is always instantiated and usageInfo.usedTokens is read once before the timing starts, which caches the encoding in modelEncodingCache. As a result, the numbers may look different from those of the original testPerformance(messages).
To get more consistent results, the average execution time is computed over 10, 100, 1000, ..., 1 million iterations. It consistently converges to around 0.019ms per call (by 'call' I mean an operation such as usageInfo.usedTokens, usageInfo.promptUsedTokens, etc.) on my system (Ubuntu 22.04.3, 5.15.146.1-microsoft-standard-WSL2, x86_64, Intel(R) Core(TM) i5-7300HQ CPU @ 2.50GHz), with NodeJS v21.2.0.
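The warm-up-then-average approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of tests/perf.js: `sampleCall` is a hypothetical stand-in for an operation such as reading usageInfo.usedTokens, and `averageTimeMs` is a name made up for this sketch.

```javascript
// Hypothetical stand-in for the real workload (e.g. reading usageInfo.usedTokens).
function sampleCall() {
  let sum = 0;
  for (let i = 0; i < 100; i++) sum += i;
  return sum;
}

// Run `fn` once untimed (warm-up), then time `iterations` runs and average them.
function averageTimeMs(fn, iterations) {
  fn(); // warm-up: lets any caches fill before the clock starts
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  const total = performance.now() - start;
  return { total, perIteration: total / iterations };
}

// Increasing iteration counts make the average less sensitive to outliers.
for (const iterations of [10, 100, 1000]) {
  const { total, perIteration } = averageTimeMs(sampleCall, iterations);
  console.log(
    `iterations=${iterations} total=${total.toFixed(4)}ms avg=${perIteration.toFixed(5)}ms`
  );
}
```

Keeping the warm-up call outside the timed region is what separates this measurement from a cold-start measurement, which would need its own test.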

Here is the latest test result (time unit: ms)


>>> Start of Test Result >>>
>>> Start of Batch 1 >>>
Testing performance...
Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 20

-> usedTokens time: 0.1168 -> promptUsedTokens time: 0.3037 -> completionUsedTokens time: 0.1459 -> usedUSD time: 1.7766 Total time: 2.343

-> usedTokens time: 0.1139 -> promptUsedTokens time: 0.1359 -> completionUsedTokens time: 0.0042 -> usedUSD time: 0.2422 Total time: 0.4962

-> usedTokens time: 0.0684 -> promptUsedTokens time: 0.0445 -> completionUsedTokens time: 0.0027 -> usedUSD time: 0.3555 Total time: 0.4711

-> usedTokens time: 0.0309 -> promptUsedTokens time: 0.0401 -> completionUsedTokens time: 0.0016 -> usedUSD time: 0.1136 Total time: 0.1862

-> usedTokens time: 0.0555 -> promptUsedTokens time: 0.0387 -> completionUsedTokens time: 0.0023 -> usedUSD time: 0.2473 Total time: 0.3438

-> usedTokens time: 0.0297 -> promptUsedTokens time: 0.0399 -> completionUsedTokens time: 0.0022 -> usedUSD time: 0.1403 Total time: 0.2121

-> usedTokens time: 0.0276 -> promptUsedTokens time: 0.0391 -> completionUsedTokens time: 0.0023 -> usedUSD time: 0.2982 Total time: 0.3672

-> usedTokens time: 0.0424 -> promptUsedTokens time: 0.0535 -> completionUsedTokens time: 0.0024 -> usedUSD time: 0.0955 Total time: 0.1938

-> usedTokens time: 0.0288 -> promptUsedTokens time: 0.0421 -> completionUsedTokens time: 0.0049 -> usedUSD time: 0.1504 Total time: 0.2262

-> usedTokens time: 0.0308 -> promptUsedTokens time: 0.0435 -> completionUsedTokens time: 0.0062 -> usedUSD time: 0.1025 Total time: 0.183

-> usedTokens time: 0.0274 -> promptUsedTokens time: 0.0395 -> completionUsedTokens time: 0.0018 -> usedUSD time: 0.0775 Total time: 0.1462

-> usedTokens time: 0.0263 -> promptUsedTokens time: 0.0389 -> completionUsedTokens time: 0.0019 -> usedUSD time: 0.0803 Total time: 0.1474

-> usedTokens time: 0.027 -> promptUsedTokens time: 0.0391 -> completionUsedTokens time: 0.0018 -> usedUSD time: 0.0952 Total time: 0.1631

-> usedTokens time: 0.0266 -> promptUsedTokens time: 0.0383 -> completionUsedTokens time: 0.0021 -> usedUSD time: 0.0749 Total time: 0.1419

-> usedTokens time: 0.0269 -> promptUsedTokens time: 0.0386 -> completionUsedTokens time: 0.0019 -> usedUSD time: 0.0752 Total time: 0.1426

-> usedTokens time: 0.0271 -> promptUsedTokens time: 0.0393 -> completionUsedTokens time: 0.0018 -> usedUSD time: 8.2488 Total time: 8.317

-> usedTokens time: 0.0362 -> promptUsedTokens time: 0.0972 -> completionUsedTokens time: 0.0649 -> usedUSD time: 0.461 Total time: 0.6593

-> usedTokens time: 0.0333 -> promptUsedTokens time: 0.0798 -> completionUsedTokens time: 0.002 -> usedUSD time: 0.1261 Total time: 0.2412

-> usedTokens time: 0.0322 -> promptUsedTokens time: 0.0702 -> completionUsedTokens time: 0.0019 -> usedUSD time: 0.1382 Total time: 0.2425

-> usedTokens time: 0.0373 -> promptUsedTokens time: 0.0605 -> completionUsedTokens time: 0.0013 -> usedUSD time: 0.0901 Total time: 0.1892

End of Batch 1
>>> Start of Batch 2 >>>

Testing performance...
Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 10

Testing performance...
Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 100

Testing performance...
Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 1000

Testing performance...
Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 10000

Testing performance...
Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 100000

Testing performance...
Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 500000

Testing performance...
Options: {"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]}, iterations: 1000000

Statistical Information:

Setting: {"options":{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]},"iterations":10}
Used Tokens for Each Call: 9
Total Execution Time: 1.5428999960422516
Total Number of Iterations: 10
Total Number of Calls per Iteration: 4
Avg Execution Time (per Iteration): 0.15429ms
Avg Execution Time (per Call): 0.03857ms

Setting: {"options":{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]},"iterations":100}
Used Tokens for Each Call: 9
Total Execution Time: 14.768799722194672
Total Number of Iterations: 100
Total Number of Calls per Iteration: 4
Avg Execution Time (per Iteration): 0.14769ms
Avg Execution Time (per Call): 0.03692ms

Setting: {"options":{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]},"iterations":1000}
Used Tokens for Each Call: 9
Total Execution Time: 96.91169968247414
Total Number of Iterations: 1000
Total Number of Calls per Iteration: 4
Avg Execution Time (per Iteration): 0.09691ms
Avg Execution Time (per Call): 0.02423ms

Setting: {"options":{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]},"iterations":10000}
Used Tokens for Each Call: 9
Total Execution Time: 714.3903965950012
Total Number of Iterations: 10000
Total Number of Calls per Iteration: 4
Avg Execution Time (per Iteration): 0.07144ms
Avg Execution Time (per Call): 0.01786ms

Setting: {"options":{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]},"iterations":100000}
Used Tokens for Each Call: 9
Total Execution Time: 8780.892964661121
Total Number of Iterations: 100000
Total Number of Calls per Iteration: 4
Avg Execution Time (per Iteration): 0.08781ms
Avg Execution Time (per Call): 0.02195ms

Setting: {"options":{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]},"iterations":500000}
Used Tokens for Each Call: 9
Total Execution Time: 39309.42104059458
Total Number of Iterations: 500000
Total Number of Calls per Iteration: 4
Avg Execution Time (per Iteration): 0.07862ms
Avg Execution Time (per Call): 0.01965ms

Setting: {"options":{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"Hello world"}]},"iterations":1000000}
Used Tokens for Each Call: 9
Total Execution Time: 77574.36164027452
Total Number of Iterations: 1000000
Total Number of Calls per Iteration: 4
Avg Execution Time (per Iteration): 0.07757ms
Avg Execution Time (per Call): 0.01939ms

End of Batch 2
>>> End of Test Result ( Thu May 16 2024 16:33:31 GMT-0400 (Eastern Daylight Time) ) >>>
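The per-iteration and per-call averages reported in Batch 2 can be reproduced from the totals: divide the total execution time by the number of iterations, then by the 4 calls made per iteration. The snippet below is only an arithmetic check on three of the rows above; the helper name `averages` is made up for this sketch, and rounding to 5 decimals is inferred from the printed values rather than taken from the test code.

```javascript
// Derive the averages printed in the statistics from a total time in ms.
function averages(totalMs, iterations, callsPerIteration) {
  const perIteration = totalMs / iterations;
  const perCall = perIteration / callsPerIteration;
  // Assumption: the report rounds to 5 decimal places.
  return { perIteration: perIteration.toFixed(5), perCall: perCall.toFixed(5) };
}

// Totals copied from the test result above.
const rows = [
  { iterations: 10, totalMs: 1.5428999960422516 },
  { iterations: 10000, totalMs: 714.3903965950012 },
  { iterations: 1000000, totalMs: 77574.36164027452 },
];

for (const { iterations, totalMs } of rows) {
  const { perIteration, perCall } = averages(totalMs, iterations, 4);
  console.log(`${iterations} iterations: ${perIteration}ms per iteration, ${perCall}ms per call`);
}
```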


- The test result seems pretty good to me, but to address this issue: https://github.com/Cainier/gpt-tokens/issues/23, I'll add more test cases in the future (e.g. testing with longer message inputs, comparing js-tiktoken with WASM tiktoken, and measuring cold-start speed). According to the test result [here](https://dev.to/maximsaplin/how-fast-is-js-tiktoken-3fmk), the WASM version of tiktoken is only faster than js-tiktoken when the input size is large (923942 tokens).

Minor refactor

I've moved modelEncodingCache and getEncodingForModelCached inside GPTTokens and made them protected static, so that they are only accessible from within the class and its subclasses. This also makes our API more encapsulated.
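The shape of this refactor can be sketched as below. This is a simplified illustration rather than the real GPTTokens code: `TokenCounter` and `loadEncoding` are hypothetical names, and because plain JavaScript has no `protected` modifier the members here are only `static` (in the TypeScript source they would carry `protected static`).

```javascript
// Hypothetical loader standing in for whatever builds the encoding for a model.
let loadCount = 0;
function loadEncoding(model) {
  loadCount += 1;
  return { model };
}

class TokenCounter {
  // Shared across all instances, so each model's encoding is built only once.
  static modelEncodingCache = {};

  static getEncodingForModelCached(model) {
    if (!TokenCounter.modelEncodingCache[model]) {
      TokenCounter.modelEncodingCache[model] = loadEncoding(model);
    }
    return TokenCounter.modelEncodingCache[model];
  }
}

const first = TokenCounter.getEncodingForModelCached('gpt-3.5-turbo-0613');
const second = TokenCounter.getEncodingForModelCached('gpt-3.5-turbo-0613');
console.log(first === second, loadCount); // same cached object, loader ran once
```

Keeping the cache on the class instead of at module level means callers cannot reach it through the module's exports, while subclasses can still reuse it.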

NPM registry error

I got the following auth error when running npm i with the latest package-lock.json

npm ERR! code E401
npm ERR! 401 Unauthorized - GET https://srun-npm.pkg.coding.net/srun4-portal/portal-core/whatwg-url/-/whatwg-url-5.0.0.tgz - Invalid credential. 请确认输入了正确的用户名和密码。

It seems that this npm registry requires some sort of authentication, so I replaced its URLs with the official registry https://registry.npmjs.org

Regression test

Added support for process.env.FINE_TUNE_MODEL: const model = process.env.FINE_TUNE_MODEL || 'ft:gpt-3.5-turbo-1106:opensftp::8IWeqPit', so that developers can use their own model for testing (the original fine-tuned model listed there is not accessible to me).
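The fallback behaves as in the sketch below. The helper `resolveFineTuneModel` is a name introduced only for this illustration (the change itself is the one-line `||` expression quoted above), and 'ft:my-own-model' is a placeholder value.

```javascript
// Default model ID, taken from the existing test.
const DEFAULT_FINE_TUNE_MODEL = 'ft:gpt-3.5-turbo-1106:opensftp::8IWeqPit';

// Prefer the environment variable; fall back to the default when it is unset or empty.
function resolveFineTuneModel(env) {
  return env.FINE_TUNE_MODEL || DEFAULT_FINE_TUNE_MODEL;
}

console.log(resolveFineTuneModel(process.env));
console.log(resolveFineTuneModel({ FINE_TUNE_MODEL: 'ft:my-own-model' }));
```

Because `||` treats an empty string as falsy, an empty FINE_TUNE_MODEL also falls back to the default.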

The test result (for my latest commit) looks good to me. The DeprecationWarning is emitted by NodeJS v21; switching to v20 makes it go away.

Testing GPT...
[1/20]: Testing gpt-3.5-turbo...
(node:725962) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Pass!
[2/20]: Testing gpt-3.5-turbo-16k...
Pass!
[3/20]: Testing gpt-4...
Pass!
[4/20]: Testing gpt-4-32k...
Ignore model gpt-4-32k:
404 The model `gpt-4-32k` does not exist or you do not have access to it.
[5/20]: Testing gpt-4-turbo-preview...
Pass!
[6/20]: Testing gpt-4-turbo...
Pass!
[7/20]: Testing gpt-4o...
Pass!
[8/20]: Testing gpt-4o-2024-05-13...
Pass!
[9/20]: Testing gpt-4-turbo-2024-04-09...
Pass!
[10/20]: Testing gpt-4-0314...
Ignore model gpt-4-0314:
404 The model `gpt-4-0314` has been deprecated, learn more here: https://platform.openai.com/docs/deprecations
[11/20]: Testing gpt-4-32k-0314...
Ignore model gpt-4-32k-0314:
404 The model `gpt-4-32k-0314` has been deprecated, learn more here: https://platform.openai.com/docs/deprecations
[12/20]: Testing gpt-4-0613...
Pass!
[13/20]: Testing gpt-4-32k-0613...
Ignore model gpt-4-32k-0613:
404 The model `gpt-4-32k-0613` does not exist or you do not have access to it.
[14/20]: Testing gpt-4-1106-preview...
Pass!
[15/20]: Testing gpt-4-0125-preview...
Pass!
[16/20]: Testing gpt-3.5-turbo-0301...
Pass!
[17/20]: Testing gpt-3.5-turbo-0613...
Pass!
[18/20]: Testing gpt-3.5-turbo-16k-0613...
Pass!
[19/20]: Testing gpt-3.5-turbo-1106...
Pass!
[20/20]: Testing gpt-3.5-turbo-0125...
Pass!
Test success!
Testing function calling...
Pass!
Testing fine-tune...
Pass!
Testing Create a fine-tuned model...
Pass!
Testing performance...
Messages: [{"role":"user","content":"Hello world"}]
GPTTokens: 1.403ms
GPTTokens: 0.351ms
GPTTokens: 0.289ms
GPTTokens: 0.284ms
GPTTokens: 1.11ms
GPTTokens: 0.412ms
GPTTokens: 0.176ms
GPTTokens: 0.3ms
GPTTokens: 0.235ms
GPTTokens: 0.265ms

I suggest setting up a CI/CD pipeline for automated testing to make development and contribution easier.
