Closed Mark-Simulacrum closed 9 months ago
@bors r+
:pushpin: Commit bac1249df039b4f0030c391641d87a117cb027a3 has been approved by Mark-Simulacrum
It is now in the queue for this repository.
:hourglass: Testing commit bac1249df039b4f0030c391641d87a117cb027a3 with merge 4d35849b3abd953dc92ecf1c793d5e83b181338b...
:sunny: Test successful - checks-actions Approved by: Mark-Simulacrum Pushing 4d35849b3abd953dc92ecf1c793d5e83b181338b to master...
Previously we were only tracking the worker time, not the endpoint time. We see a direct correlation between the throughput of a job and the worker time. This seems wrong to me: as long as the worker keeps up with the input rate, the throughput shouldn't be affected.
Note that we believe the worker should not affect the HTTP endpoint at all: we connect the two with a bounded queue, and pushing into the queue is done with `try_send`, which shouldn't block (https://docs.rs/crossbeam-channel/latest/crossbeam_channel/struct.Sender.html#method.try_send) and returns an error if the queue is full. We already emit a metric if the queue is full, and that's not happening here. The hope is that the extra metric here gives us some clue as to what the problem is.
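For readers unfamiliar with the pattern, here is a minimal sketch of a non-blocking handoff into a bounded queue. It is not the code from this PR: it uses the standard library's `std::sync::mpsc::sync_channel` as a stand-in for crossbeam-channel (both expose a `try_send` that returns an error instead of blocking when the queue is full), and `Endpoint`, `enqueue`, and `queue_full_count` are hypothetical names, with a plain atomic counter standing in for the real "queue is full" metric.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical endpoint side of the handoff (not the PR's actual type).
pub struct Endpoint {
    tx: SyncSender<String>,
    /// Stand-in for the "queue is full" metric.
    queue_full: AtomicU64,
}

impl Endpoint {
    /// Creates the endpoint plus the receiver the worker would drain.
    pub fn new(capacity: usize) -> (Endpoint, Receiver<String>) {
        let (tx, rx) = sync_channel(capacity);
        let endpoint = Endpoint {
            tx,
            queue_full: AtomicU64::new(0),
        };
        (endpoint, rx)
    }

    /// Tries to hand a job to the worker without ever blocking.
    /// Returns true if the job was queued, false if it was dropped.
    pub fn enqueue(&self, job: String) -> bool {
        match self.tx.try_send(job) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) => {
                // Queue is full: record it and drop the job.
                self.queue_full.fetch_add(1, Ordering::Relaxed);
                false
            }
            // The worker has gone away; nothing to count as "full".
            Err(TrySendError::Disconnected(_)) => false,
        }
    }

    /// Number of jobs rejected because the queue was full.
    pub fn queue_full_count(&self) -> u64 {
        self.queue_full.load(Ordering::Relaxed)
    }
}
```

With a capacity of 1, a second `enqueue` before the worker calls `recv` returns immediately with `false` and bumps the counter, so a slow worker shows up as a non-zero counter rather than as a slow endpoint.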
Metric graphs: