Closed by darkacorn 1 month ago
Token counting is already implemented here. The completion tokens you get back from the proxy include the tokens used in every call made while running the approach.
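To make the idea concrete, here is a rough sketch of what summing usage across the internal calls of one approach could look like. This is not the project's actual code. It assumes each call returns a dict with an OpenAI-style `usage` field (`prompt_tokens`, `completion_tokens`), and the names `Usage` and `total_usage` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Running totals for one proxied request (hypothetical helper)."""
    prompt_tokens: int = 0
    completion_tokens: int = 0

    def add(self, response: dict) -> None:
        # Assumption: responses carry an OpenAI-style "usage" dict.
        # A missing or null "usage" is treated as zero tokens.
        usage = response.get("usage") or {}
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)


def total_usage(responses) -> Usage:
    """Sum the token counts of every internal call made by an approach."""
    total = Usage()
    for response in responses:
        total.add(response)
    return total


# Three internal calls; the last one reported no usage at all.
calls = [
    {"usage": {"prompt_tokens": 12, "completion_tokens": 30}},
    {"usage": {"prompt_tokens": 50, "completion_tokens": 20}},
    {"usage": None},
]
total = total_usage(calls)
print(total.prompt_tokens, total.completion_tokens)
```

The point is only that the count reported to the client would be the sum over all internal calls, not just the last one.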
We also recently tested various techniques on the AIME benchmark to check which ones are more effective on a per-token basis. Results for that are available here
Thank you, I did not see that the discussions are active here. Sorry for opening an issue for that.
Having it for both streaming and non-streaming responses would be amazing, especially for keeping track of input/output cost.
It's easy enough for non-streaming, but streaming is a whole other can of worms.
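For what it's worth, a minimal sketch of why streaming is harder: the text arrives in pieces, and a usage count (if the backend sends one at all) typically shows up only at the end. The sketch below assumes OpenAI-style chunk dicts with `choices[].delta.content` and an optional `usage` field on some chunk. `summarize_stream` and the `count_tokens` fallback are hypothetical names, and the whitespace splitter in the example is just a stand-in for a real tokenizer, so its numbers are not real token counts.

```python
def summarize_stream(chunks, count_tokens=None):
    """Collect streamed text and return (text, completion_tokens, source).

    source is "reported" if a chunk carried usage, "estimated" if we had
    to count locally, and "unknown" if neither was possible.
    """
    parts = []
    usage = None
    for chunk in chunks:
        # Assumption: OpenAI-style chunks; "choices" may be empty.
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            piece = delta.get("content")
            if piece:
                parts.append(piece)
        # Keep the last usage object seen, if any chunk carries one.
        if chunk.get("usage"):
            usage = chunk["usage"]
    text = "".join(parts)
    if usage is not None:
        return text, usage.get("completion_tokens"), "reported"
    if count_tokens is not None:
        return text, count_tokens(text), "estimated"
    return text, None, "unknown"


stream = [
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": " world"}}]},
    {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 2}},
]
print(summarize_stream(stream))

# Same stream without the usage chunk: fall back to a local estimate.
# The whitespace split is only a placeholder for a real tokenizer.
print(summarize_stream(stream[:2], count_tokens=lambda s: len(s.split())))
```

When no usage is reported, the count is only as good as the local tokenizer, which is probably where most of the can of worms comes from.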