OpenRouterTeam / openrouter-runner

Inference engine powering open source models on OpenRouter
https://openrouter.ai
MIT License

fix: running total of tokens for streams #64

Closed (sambarnes closed 7 months ago)

sambarnes commented 7 months ago

Details

In preparation for sending GPU timing information in a proper usage struct, this gets the refactor started and fixes the current behavior of sending 0 token counts in streamed responses.

example streaming:

...
data: {"text": " arena", "prompt_tokens": 23, "completion_tokens": 290, "usage": {"prompt_tokens": 23, "completion_tokens": 290, "total_tokens": 313}, "done": false}

data: {"text": ".", "prompt_tokens": 23, "completion_tokens": 291, "usage": {"prompt_tokens": 23, "completion_tokens": 291}, "done": false}

data: {"text": "", "prompt_tokens": 23, "completion_tokens": 292, "usage": {"prompt_tokens": 23, "completion_tokens": 292}, "done": false}

data: {"text": "", "prompt_tokens": 23, "completion_tokens": 292, "usage": {"prompt_tokens": 23, "completion_tokens": 292}, "done": true}
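For readers who want to see the idea in isolation, here is a minimal sketch of keeping a running completion-token total while streaming. It is not the code from this PR: `Usage`, `stream_chunks`, `_format`, and the `(text, new_token_count)` input shape are all hypothetical names chosen for illustration, and only the payload field names are taken from the sample events above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class Usage:
    # Hypothetical usage struct; field names mirror the sample payloads.
    prompt_tokens: int
    completion_tokens: int


def _format(text: str, usage: Usage, done: bool) -> str:
    """Serialize one event as a server-sent-events style `data:` line."""
    payload = {
        "text": text,
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "usage": asdict(usage),
        "done": done,
    }
    return f"data: {json.dumps(payload)}\n\n"


def stream_chunks(
    prompt_tokens: int, pieces: Iterable[Tuple[str, int]]
) -> Iterator[str]:
    """Yield events whose token counts accumulate across the stream.

    `pieces` is an assumed stand-in for whatever the inference engine
    produces per step: (generated text, number of new tokens).
    """
    usage = Usage(prompt_tokens=prompt_tokens, completion_tokens=0)
    for text, new_tokens in pieces:
        # Accumulate rather than reporting 0 on every intermediate event.
        usage.completion_tokens += new_tokens
        yield _format(text, usage, done=False)
    # The final event repeats the totals and marks the stream as finished.
    yield _format("", usage, done=True)
```

Because the same `Usage` object is updated in place and serialized on every event, a consumer that only reads the last event still sees the full totals.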

example non streaming:

{"text": " Project A119 was a covert research and development effort by the United States Air Force (USAF) to study, test, and evaluate anti-personnel napalm, also known as Agent Orange II. Begun around early 1963 and scheduled to run for about a year, the project's main objectives were to evaluate the potential effectiveness of Agent Orange II as an incendiary weapon and as a means to control populations during wartime emergencies, especially in scenarios involving guerrilla warfare. USAF officials, mainly under William P. Yount, conducted tests on wool and animal skins to assess the toxicity, explosive capacity, and lethality of the substance when used as a defensive tactic against enemy forces. These experiments raised concerns about its possible destructive effects on both the targeted enemy personnel and the environment, prompting criticism and leading to increased scrutiny of similar projects in the country's war efforts.", "prompt_tokens": 23, "completion_tokens": 200, "usage": {"prompt_tokens": 23, "completion_tokens": 200}, "done": true}
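On the consuming side, both response shapes above can be read the same way. The sketch below is an assumption about how a client might do that, not part of the runner: `parse_event` and `final_usage` are hypothetical helpers. Since `total_tokens` appears in only one of the sample events, the sketch derives it from the other two counts when it is absent.

```python
import json
from typing import Dict, Iterable, Optional


def parse_event(line: str) -> Optional[dict]:
    """Parse one line of output, with or without a `data:` prefix."""
    line = line.strip()
    if not line or line == "...":
        return None  # blank separator or elided output
    if line.startswith("data:"):
        line = line[len("data:"):]
    return json.loads(line)


def final_usage(lines: Iterable[str]) -> Optional[Dict[str, int]]:
    """Return the usage counts from the first event marked `done`."""
    for line in lines:
        event = parse_event(line)
        if event is None or not event.get("done"):
            continue
        usage = dict(event["usage"])
        # Fill in total_tokens only when the server did not send it.
        usage.setdefault(
            "total_tokens",
            usage["prompt_tokens"] + usage["completion_tokens"],
        )
        return usage
    return None
```

A non-streaming response is just a single line with `"done": true`, so `final_usage([body])` covers that case as well.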


sambarnes commented 7 months ago

Looking good on dev. Going to merge and deploy, then move on to testing the private router.