The latency variable is used, but can be undefined:
Initialization happens only under this condition:
if chunk == "[DONE]":
latency = time.perf_counter() - st
But it gets referred to without this condition:
output.latency = latency
So in some cases, it leads to the following error:
Traceback (most recent call last):\n File \"modular-inference-bench/src/modular_inference_benchmark/engine/backend_functions.py\", line 262, in async_request_openai_completions\n output.latency = latency\n ^^^^^^^\nUnboundLocalError: cannot access local variable 'latency' where it is not associated with a value\n
Hi,
The
latency
variable is used, but can be undefined: Initialization happens only under this condition:But it gets referred to without this condition:
So in some cases, it leads to the following error:
https://github.com/CentML/modular-inference-bench/blob/536f5b83f49dc4b39736452c1ed60ebbd74b68a3/src/modular_inference_benchmark/engine/backend_functions.py#L258