Closed · cehrett closed 3 months ago
To support this, we need a script that takes a prompt and a model and outputs the number of tokens the prompt occupies for that model. (Using this script, we can then throw a warning or exception if the prompt is sufficiently large.)
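A minimal sketch of what such a script could look like. Note the assumptions: the ~4-characters-per-token heuristic is a rough stand-in for a real tokenizer (e.g. tiktoken), and the 63,000-token limit and 75% threshold are taken from the discussion below; the function names are hypothetical.

```python
MAX_TOKENS = 63000    # assumed context budget from this thread
WARN_FRACTION = 0.75  # error threshold (75% of the budget)

def count_tokens(prompt: str) -> int:
    """Rough token estimate: ~4 characters per token.

    A real implementation would use the model's tokenizer instead.
    """
    return max(1, len(prompt) // 4)

def check_prompt_size(prompt: str, max_tokens: int = MAX_TOKENS) -> int:
    """Return the token count, raising if the prompt is too large."""
    n = count_tokens(prompt)
    if n >= WARN_FRACTION * max_tokens:
        raise RuntimeError(
            f"Prompt is {n} tokens, >= 75% of the {max_tokens}-token limit"
        )
    return n
```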
Added functions to count tokens and partition messages (assuming the input is a markdown table). Subprocessing kicks in at 63,000 tokens, but the threshold can be changed if necessary. A runtime error is thrown if a prompt that is not being partitioned reaches 75% or more of the max token count, to avoid unnecessary API calls.
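The partitioning step could look something like this sketch: split the markdown-table rows into batches that each stay under the token threshold, repeating the header in every batch. The row handling, token estimate, and function names are assumptions, not the actual implementation.

```python
TOKEN_THRESHOLD = 63000  # assumed trigger point for subprocessing

def estimate_tokens(text: str) -> int:
    """Rough heuristic (~4 chars/token); swap in a real tokenizer."""
    return max(1, len(text) // 4)

def partition_rows(header: str, rows: list[str],
                   threshold: int = TOKEN_THRESHOLD) -> list[str]:
    """Group markdown-table rows into prompts, each under the threshold."""
    chunks, current = [], [header]
    budget = estimate_tokens(header)
    for row in rows:
        cost = estimate_tokens(row)
        # Start a new chunk if adding this row would exceed the budget
        # (but never emit a chunk containing only the header).
        if budget + cost > threshold and len(current) > 1:
            chunks.append("\n".join(current))
            current, budget = [header], estimate_tokens(header)
        current.append(row)
        budget += cost
    chunks.append("\n".join(current))
    return chunks
```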
Sometimes there are so many frame-clusters present in a day that the collapsing and description scripts cannot fit them all in a single prompt. These scripts should therefore detect when the prompt exceeds the context window and, if it does, perform the collapsing/description in multiple stages on sub-portions of the data.
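The multi-stage flow described above could be sketched roughly as below: run each sub-portion through the model separately, then combine the partial results for the day. `call_model` is a hypothetical stand-in for the real API call, and joining with newlines is an assumed merge strategy.

```python
from typing import Callable

def collapse_day(prompt_chunks: list[str],
                 call_model: Callable[[str], str]) -> str:
    """Run each sub-portion through the model, then merge the outputs."""
    partial_results = [call_model(chunk) for chunk in prompt_chunks]
    return "\n".join(partial_results)
```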