OpenAutoCoder / Agentless

Agentless🐱: an agentless approach to automatically solve software development problems
MIT License

Tokens #13

Open Hodge931 opened 2 months ago

Hodge931 commented 2 months ago

In the paper, the average token consumption is reported as 42,376 per example. Table 2 in the paper says that 246 lines of code remain after figuring out the edit locations and are thus processed during the repair phase. If each line is roughly 5 tokens on average and 42 patches are generated per bug, then 246 × 5 × 42 = 51,660 tokens would be included in the prompts during repair alone. On top of that come the generated tokens and the token consumption during localization. Am I missing something? Thanks a lot!
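As a quick check, here is the back-of-envelope estimate from the question above, assuming (as the question does) that the prompt would be re-sent for every one of the 42 samples:

```python
# Figures taken from the question above (Table 2 of the paper plus the
# questioner's own assumptions): 246 lines kept after localization,
# ~5 tokens per line, 42 patch samples per bug.
lines_kept = 246
tokens_per_line = 5
samples_per_bug = 42

# Prompt-token total IF the prompt were billed once per sample:
naive_prompt_tokens = lines_kept * tokens_per_line * samples_per_bug
print(naive_prompt_tokens)  # 51660, exceeding the reported 42,376 average
```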

brutalsavage commented 2 months ago

Hi, when we sample multiple patches per bug, we do not consume the input tokens (i.e., prompt tokens) multiple times; we only pay for generating each patch (i.e., completion tokens). You can verify this with the OpenAI API by requesting multiple samples: the cost and token count grow linearly with the number of completion tokens, not with the number of prompt tokens, which are consumed only once for the same set of samples.
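A minimal sketch of that billing model (the per-token prices below are placeholders, not the paper's actual numbers): prompt tokens are charged once per request, while completion tokens are charged once per sample.

```python
def request_cost(prompt_tokens, completion_tokens_per_sample, n,
                 prompt_price=0.5, completion_price=1.5):
    """Estimated dollar cost of one API request with n samples.

    Prices are hypothetical, expressed in dollars per million tokens.
    The prompt is billed once; completions are billed once per sample.
    """
    prompt_cost = prompt_tokens * prompt_price / 1e6
    completion_cost = n * completion_tokens_per_sample * completion_price / 1e6
    return prompt_cost + completion_cost

# E.g. a 1,230-token prompt (246 lines x ~5 tokens) with ~300-token patches:
# going from 1 sample to 42 samples only adds completion cost, since the
# prompt term does not depend on n.
one_sample = request_cost(1230, 300, 1)
many_samples = request_cost(1230, 300, 42)
```

This is why the questioner's 51,660-token estimate overshoots: the 1,230 prompt tokens are counted once, not 42 times.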