fix: running total of tokens for streams

Details

In preparation for sending GPU timing information in a proper usage struct, this gets the refactor started and fixes the current behavior of sending token counts of 0.

example streaming:

...
data: {"text": " arena", "prompt_tokens": 23, "completion_tokens": 290, "usage": {"prompt_tokens": 23, "completion_tokens": 290, "total_tokens": 313}, "done": false}

data: {"text": ".", "prompt_tokens": 23, "completion_tokens": 291, "usage": {"prompt_tokens": 23, "completion_tokens": 291}, "done": false}

data: {"text": "", "prompt_tokens": 23, "completion_tokens": 292, "usage": {"prompt_tokens": 23, "completion_tokens": 292}, "done": false}

data: {"text": "", "prompt_tokens": 23, "completion_tokens": 292, "usage": {"prompt_tokens": 23, "completion_tokens": 292}, "done": true}
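For illustration only, here is a minimal sketch of how a running total like the one in the events above could be produced. It is not the code from this PR: `stream_with_usage`, `_event`, and the `(text, new_token_count)` chunk shape are hypothetical names and assumptions. Only the field names (`text`, `prompt_tokens`, `completion_tokens`, `usage`, `done`) are taken from the sample output.

```python
import json
from typing import Iterable, Iterator, Tuple


def _event(text: str, prompt_tokens: int, completion_tokens: int, done: bool) -> str:
    """Format one server-sent event line with the fields shown in the samples."""
    payload = {
        "text": text,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        # Nested usage struct mirrors the top-level counts.
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
        },
        "done": done,
    }
    return f"data: {json.dumps(payload)}\n\n"


def stream_with_usage(
    prompt_tokens: int, chunks: Iterable[Tuple[str, int]]
) -> Iterator[str]:
    """Yield events whose completion_tokens is a running total.

    `chunks` is an assumed iterable of (text, new_token_count) pairs;
    a real generation backend may expose this differently.
    """
    completion_tokens = 0
    for text, new_tokens in chunks:
        # Accumulate instead of reporting a per-chunk (or zero) count.
        completion_tokens += new_tokens
        yield _event(text, prompt_tokens, completion_tokens, done=False)
    # The final event repeats the last total and marks the stream as done.
    yield _event("", prompt_tokens, completion_tokens, done=True)
```

With this shape, a client that only reads the last event still gets the full completion count, and a client that disconnects early has the count as of the last chunk it received.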

example non-streaming:

{"text": " Project A119 was a covert research and development effort by the United States Air Force (USAF) to study, test, and evaluate anti-personnel napalm, also known as Agent Orange II. Begun around early 1963 and scheduled to run for about a year, the project's main objectives were to evaluate the potential effectiveness of Agent Orange II as an incendiary weapon and as a means to control populations during wartime emergencies, especially in scenarios involving guerrilla warfare. USAF officials, mainly under William P. Yount, conducted tests on wool and animal skins to assess the toxicity, explosive capacity, and lethality of the substance when used as a defensive tactic against enemy forces. These experiments raised concerns about its possible destructive effects on both the targeted enemy personnel and the environment, prompting criticism and leading to increased scrutiny of similar projects in the country's war efforts.", "prompt_tokens": 23, "completion_tokens": 200, "usage": {"prompt_tokens": 23, "completion_tokens": 200}, "done": true}

Code of Conduct

[x] I agree to follow this project's Code of Conduct
[x] I agree to license this contribution under the MIT LICENSE
[x] I checked the current PR for duplication.

OpenRouterTeam / openrouter-runner

fix: running total of tokens for streams #64
