[Bug]: Metrics incorrect when having zero throughput

Your current environment

Docker container v0.6.2 (36d2ba5ad90b)

🐛 Describe the bug

The /metrics endpoint does not report correct values when no requests are going through the system. Our Prometheus instance scrapes this endpoint every 20 seconds, and when there is no load on the server, first_token_output stays at its last known value. I expected it to reset after each scrape; instead, it shows an average over the pod's lifespan. Other metrics (such as request latency) have the same problem.

Expected behavior: when the /metrics endpoint is scraped, the count restarts, so the reported value covers only the interval since the previous scrape instead of averaging over the whole time the metrics have been running. This would make it much easier to spot when servers are overloaded and when the HPA needs to kick in.
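To illustrate the difference between the lifetime average and the per-scrape value I am after, here is a minimal sketch. It assumes the metric is exposed as a cumulative sum and count (as Prometheus histograms and summaries typically are); the function name `windowed_average` and the sample numbers are made up for illustration and are not taken from the engine.

```python
from typing import Optional


def windowed_average(prev_sum: float, prev_count: float,
                     curr_sum: float, curr_count: float) -> Optional[float]:
    """Average over the interval between two scrapes of a cumulative metric.

    Returns None when no new observations arrived, or when the counter
    went backwards (for example after a pod restart).
    """
    delta_count = curr_count - prev_count
    if delta_count <= 0:
        return None
    return (curr_sum - prev_sum) / delta_count


# Hypothetical scrape values (sum of seconds, number of requests).
lifetime_avg = 30.0 / 100                        # average since the pod started
busy = windowed_average(10.0, 50, 30.0, 100)     # only the last interval
idle = windowed_average(30.0, 100, 30.0, 100)    # no traffic between scrapes

print(lifetime_avg, busy, idle)
```

With zero throughput the windowed value reports "no data" instead of repeating the stale lifetime average, which is the signal that would be useful for autoscaling.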

PygmalionAI / aphrodite-engine

[Bug]: Metrics incorrect when having zero throughput #782
