Support `stream=True` without buffering up the full response

yagil / tokmon

CLI to monitor your program's OpenAI API token usage.

Apache License 2.0


Support `stream=True` without buffering up the full response #4

Open yagil opened 1 year ago

yagil commented 1 year ago

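If the monitored program uses OpenAI response streaming (`stream=True`, delivered as server-sent events), incoming chunks get buffered until the `[DONE]` message arrives. The monitored program therefore receives the whole response at once instead of incrementally, which alters its behavior and is undesirable.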

Related code: see the commented-out block in tokmon.py around line 34.
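
For illustration, here is a minimal sketch of the desired behavior: each SSE line is forwarded to the monitored program as soon as it arrives, and the text needed for token counting is accumulated on the side. This is not tokmon's actual code. The function name `relay_sse` and the `forward` callback are hypothetical, and the sketch assumes chat-completion style chunks of the form `data: {"choices": [{"delta": {"content": "..."}}]}` terminated by `data: [DONE]`.

```python
import json
from typing import Callable, Iterable, List


def relay_sse(lines: Iterable[bytes], forward: Callable[[bytes], None]) -> str:
    """Forward every SSE line immediately and return the accumulated text.

    `lines` is any iterable of raw SSE lines (hypothetical input shape);
    `forward` is a hypothetical callback that writes a line to the client.
    """
    pieces: List[str] = []
    for line in lines:
        # Pass the line through first, so the client is never delayed.
        forward(line)

        text = line.decode("utf-8").strip()
        if not text.startswith("data:"):
            continue  # blank separator lines, comments, other SSE fields

        payload = text[len("data:"):].strip()
        if payload == "[DONE]":
            continue  # end-of-stream marker; nothing to accumulate

        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip payloads that are not valid JSON
        if not isinstance(event, dict):
            continue

        # Assumed chunk shape: choices[].delta.content holds the new text.
        for choice in event.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                pieces.append(content)

    return "".join(pieces)
```

The returned string can be tokenized once the stream ends, so usage accounting still happens at `[DONE]` while the monitored program sees chunks at their original pace.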