Closed dominiccooney closed 4 months ago
Could you design language for the case where "each file you @-tagged was ok, but you tagged too many in aggregate"?
@dominiccooney @chillatom would it suffice to prevent people from @'ing files if they hit too many?
One gotcha is that you can @ mention things that aren't files too. Are symbols, URLs and line ranges included in the budget?
Alternatively, you let people @ whatever they want but then change the color of the tokens to show which ones won't be included, similar to this past design concept around context limits:
Design the product behavior when you @-tag files in follow-up chats: what's the policy for dropping old files? Also elaborate this task list.
Straw person suggestion:
@toolmantim
would it suffice to prevent people from @'ing files if they hit too many?
My $0.02: We thought about making it a token count budget, but I like this better. It is way easier to understand, and the error message is nice and crisp: no file is larger than A, you can have N of them, so there is a limit of A × N. LLM attention might benefit from fewer inputs regardless of their length anyway. And we can always increase or change it later.
Are symbols, URLs and line ranges included in the budget?
My $0.02:
Definitely for line ranges, because in the limit they're isomorphic to files. If we're doing the N-file limit, later we could take a bunch of line ranges in the same file and say they count as 1, harmonizing it that way?
URLs: I could go either way. We didn't ship URLs (IIRC), so as long as we loop back before shipping it... let's add a launch-blocking task to a mini-PRD URL issue.
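The "multiple line ranges in one file count as 1" harmonization above could be sketched roughly as follows. The `Mention` shape and `countTowardFileLimit` name are illustrative, not actual Cody APIs:

```typescript
// Hypothetical sketch: count @-mentioned line ranges in the same file as a
// single item toward the N-item limit. Symbols and URLs each count as one.
interface Mention {
  kind: 'file' | 'range' | 'symbol' | 'url'
  uri: string // file path or URL
  tokens: number
}

function countTowardFileLimit(mentions: Mention[]): number {
  const seenFiles = new Set<string>()
  let count = 0
  for (const m of mentions) {
    if (m.kind === 'file' || m.kind === 'range') {
      // Multiple ranges in the same file harmonize to one item.
      if (!seenFiles.has(m.uri)) {
        seenFiles.add(m.uri)
        count++
      }
    } else {
      count++
    }
  }
  return count
}
```

With this, two ranges in `a.ts` plus the whole of `b.ts` would count as 2 items, not 3.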
Is the context limit counter limited to @-mention context only?
How does the user know if they are also hitting their input token limit? E.g., when they just copy and paste a file into the chat box.
I ran a little analysis over some common repos.
A budget of 30k tokens would give us 10 files at the 90th percentile.
Total files: 20,571 · Mean: 1,073.6 tokens · Median: 431 · Max: 9,120 · 90th percentile: 3,081
Repos checked
https://github.com/sourcegraph/cody.git, https://github.com/facebook/react.git, https://github.com/django/django.git, https://github.com/rust-lang/rust.git, https://github.com/golang/go.git, https://github.com/apache/kafka.git, https://github.com/google/leveldb.git
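For reference, the per-file token analysis above could be reproduced with something like the sketch below. It assumes a crude ~4 characters/token heuristic; the real analysis presumably used a proper tokenizer, so exact numbers will differ:

```typescript
// Rough sketch: walk a checked-out repo and report per-file token stats.
import * as fs from 'node:fs'
import * as path from 'node:path'

// Crude heuristic (assumption): ~4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Nearest-rank percentile over a sorted ascending array.
function percentile(sorted: number[], p: number): number {
  const idx = Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length))
  return sorted[idx]
}

function analyze(root: string): void {
  const counts: number[] = []
  const walk = (dir: string): void => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const full = path.join(dir, entry.name)
      if (entry.isDirectory()) walk(full)
      else counts.push(estimateTokens(fs.readFileSync(full, 'utf8')))
    }
  }
  walk(root)
  counts.sort((a, b) => a - b)
  const mean = counts.reduce((s, n) => s + n, 0) / counts.length
  console.log(
    `Total files: ${counts.length}, Mean: ${mean.toFixed(1)}, ` +
      `Median: ${percentile(counts, 50)}, 90th percentile: ${percentile(counts, 90)}`,
  )
}
```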
We let people @ things using the same token limits for follow-up messages. Any past @'d things will be excluded from context if they can't fit into the budget, evicting the oldest first.
@toolmantim I like this. So we would have a 30k token budget (per analysis above):
First message: User mentions 3 files of 3k tokens each, using 9k of the 30k token budget.
Follow-up 1: User mentions 4 more files of 5k each, using 20k more tokens. We keep the 9k tokens of previously mentioned files, for a total of 29k/30k used.
Follow-up 2: User mentions 2 more files of 4k each, needing 8k tokens. We now evict the 3 oldest files (freeing 9k tokens) to make space for the 2 new files and the 8k tokens they consume, leaving 28k/30k used.
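The oldest-first eviction policy above could be sketched as follows. The `TrackedMention` shape, function name, and 30k constant are assumptions from this thread, not shipped Cody code:

```typescript
// Minimal sketch of oldest-first eviction against a fixed token budget.
interface TrackedMention {
  uri: string
  tokens: number
}

const BUDGET = 30_000 // per the analysis in this thread

// Returns the mentions kept in context after adding `incoming`, evicting
// the oldest existing mentions until everything fits the budget.
function addWithEviction(
  existing: TrackedMention[],
  incoming: TrackedMention[],
): TrackedMention[] {
  const all = [...existing, ...incoming]
  let used = all.reduce((sum, m) => sum + m.tokens, 0)
  let dropFrom = 0
  // Evict oldest first, but never evict the just-added mentions.
  while (used > BUDGET && dropFrom < existing.length) {
    used -= all[dropFrom].tokens
    dropFrom++
  }
  return all.slice(dropFrom)
}
```

Running the follow-up 2 scenario through this (3 × 3k + 4 × 5k existing, 2 × 4k incoming) evicts the three 3k files and keeps 6 mentions totaling 28k tokens, matching the walkthrough.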
30k token
@chillatom is this 30k tokens equal to 30k × 4 bytes per token = 120k bytes?
Update: confirmed this is referring to token count and not bytes
User @-mention budget: this should be large enough for a user to @-mention 5 files that are 85th percentile in size for the average codebase, e.g. 5 × 1,000 LOC = 5,000 LOC × 7.5 tokens/line ≈ 38k tokens.
It's okay to have a per-file limit, so that we don't allow a single 38k-token file as input: e.g. no file exceeds 7k tokens, but you can add as many files as you want up to 38k tokens total.
This extra budget applies specifically to @-mentioned user context files. For example, for chat, DefaultPrompter needs new logic to apply this special budget.