qaeu / wealth-app

https://qaeu.vercel.app

Optimise tokenisation when querying ChatGPT #1

Open qaeu opened 4 months ago

qaeu commented 4 months ago
```
Full server action length: 1250
Tokenising length: 293
Full server action length: 1211
Tokenising length: 278
Full action length: 1537
Tokenising length: 232
Full action length: 1460
Tokenising length: 347
Full action length: 1176
Tokenising length: 342
```

Durations are in ms: "Full server action length" is queryChatGPT() and "Tokenising length" is setPreviousTokens().

setPreviousTokens() calls and awaits js-tiktoken functions sequentially.
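A minimal sketch of that sequential shape (all names besides setPreviousTokens() are hypothetical, and a stub stands in for js-tiktoken's encode):

```typescript
// Stub standing in for js-tiktoken's encode (hypothetical; the real code
// presumably uses an encoding's encode()). One fake token id per word.
const encode = async (text: string): Promise<number[]> =>
  text.split(/\s+/).filter(Boolean).map((_, i) => i);

// Sequential shape: the second encode only starts after the first resolves.
async function setPreviousTokens(prompt: string, ignore: string): Promise<number> {
  const promptTokens = await encode(prompt);
  const ignoreTokens = await encode(ignore);
  return promptTokens.length - ignoreTokens.length;
}
```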

Testing with Promise.all() to resolve both function calls concurrently instead increases tokenising duration by ~50%:

```
Full server action length: 1544
Tokenising length: 504
Full server action length: 1620
Tokenising length: 478
Full server action length: 1024
Tokenising length: 424
```
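The regression is plausible: tokenising is CPU-bound, so both encodes still run serially on the single JS thread and Promise.all() only adds scheduling overhead. A sketch of that variant, with a hypothetical stub in place of js-tiktoken:

```typescript
// Stub tokeniser (hypothetical stand-in for js-tiktoken's encode).
const encode = async (text: string): Promise<number[]> =>
  text.split(/\s+/).filter(Boolean).map((_, i) => i);

// Promise.all variant: both encode promises are created before either is
// awaited, but the CPU-bound work is still serialised on one thread.
async function setPreviousTokensConcurrent(prompt: string, ignore: string): Promise<number> {
  const [promptTokens, ignoreTokens] = await Promise.all([encode(prompt), encode(ignore)]);
  return promptTokens.length - ignoreTokens.length;
}
```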

Should test using the wasm tiktoken package.

qaeu commented 4 months ago

Manually preprocessing IGNORE_STRINGS into tokens, which removes the second encode call, decreases tokenising duration by ~50%:

```
Full server action length: 1261
Tokenising length: 112
Full server action length: 884
Tokenising length: 134
Full server action length: 1056
Tokenising length: 183
Full server action length: 1047
Tokenising length: 131
```
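One way this preprocessing might look (the IGNORE_STRINGS contents and helper names are hypothetical, and a stub replaces js-tiktoken): the ignore strings' token counts are computed once at module load, so the hot path performs a single encode.

```typescript
// Hypothetical ignore list and a stub tokeniser standing in for js-tiktoken.
const IGNORE_STRINGS = ["foo bar", "baz"];
const encodeSync = (text: string): number[] =>
  text.split(/\s+/).filter(Boolean).map((_, i) => i);

// Precomputed once at module load; the request path never re-encodes these.
const IGNORE_TOKEN_COUNT = IGNORE_STRINGS
  .map((s) => encodeSync(s).length)
  .reduce((sum, n) => sum + n, 0);

function setPreviousTokensFast(prompt: string): number {
  return encodeSync(prompt).length - IGNORE_TOKEN_COUNT;
}
```

Caveat: with a real BPE tokeniser, strings encoded in isolation can tokenise differently than when they appear inside a larger prompt, so precomputed counts are an approximation.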

Leaving as-is for now for ease of modification.

Edit: should consider preprocessing at build time.
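A possible shape for that build step (file name, ignore contents, and the stub tokeniser are all hypothetical): tokenise IGNORE_STRINGS once during the build and emit a JSON module the server action can import, so no encode call for them happens at request time.

```typescript
import { writeFileSync } from "node:fs";

// Hypothetical ignore list and a stub standing in for js-tiktoken's encode.
const IGNORE_STRINGS = ["foo bar"];
const encode = (text: string): number[] =>
  text.split(/\s+/).filter(Boolean).map((_, i) => i);

// Map each ignore string to its precomputed tokens and write them out as
// JSON, to be imported by the server action instead of re-encoding.
const tokens = Object.fromEntries(IGNORE_STRINGS.map((s) => [s, encode(s)]));
writeFileSync("ignoreTokens.json", JSON.stringify(tokens));
```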