Open hanlaur opened 1 month ago
@hanlaur many thanks for raising the issue and identifying the underlying problem. This makes sense.
Feel free to provide a fix and we will be happy to review it soon.
Thanks @sakoush for the quick response! I was hoping to avoid going through the CLA steps this time, this being just a single variable name change... so I created an issue instead of a PR. I am wondering if you would be able to make the PR for this?
@hanlaur hopefully we will look into this in the next couple of weeks; we will also need to add testing.
Hello,
In the case of adaptive batching, if I define `MLSERVER_MODEL_MAX_BATCH_TIME=1`, expecting 1-second batching, the timeout is reached much sooner than it should be. Here is an example (with a debug print added, just before a request is added to the batch, that prints the calculated `timeout`). The debug prints show that the timestamps of the requests do not match the remaining timeout: the first request happens at `07:27:26,260`, but MLServer thinks it has reached the 1-second timeout at `07:27:26,386`, i.e. in approx. 130 ms.
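A quick check of the two reported timestamps (assuming both come from the same log clock) confirms how far short of the configured window the timeout fired:

```python
from datetime import datetime, timedelta

first_request = datetime.strptime("07:27:26,260", "%H:%M:%S,%f")
timeout_hit = datetime.strptime("07:27:26,386", "%H:%M:%S,%f")

elapsed = timeout_hit - first_request
print(elapsed)                         # 0:00:00.126000, i.e. ~130 ms
assert elapsed < timedelta(seconds=1)  # far short of the configured 1 s window
```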
The issue seems to be the timeout calculation here: https://github.com/SeldonIO/MLServer/blob/1ce29e65d5cd14a24a573f2f44ae2eac2b51a0f4/mlserver/batching/adaptive.py#L139
I believe the formula should be `timeout = self._max_batch_time - (current - start)`.
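To make the proposed fix concrete, here is a minimal, self-contained sketch of a batch-collection loop built around that formula. This is a simplified stand-in, not the actual MLServer implementation: the function name `partial_batch`, the queue, and the parameters are hypothetical; only the timeout recomputation mirrors the suggested change.

```python
import asyncio
from time import monotonic


async def partial_batch(
    queue: asyncio.Queue, max_batch_size: int, max_batch_time: float
) -> list:
    """Collect up to max_batch_size items, keeping the batch window open for
    at most max_batch_time seconds counted from the first item's arrival.
    (Simplified sketch, not the actual MLServer code.)"""
    batch = [await queue.get()]  # first request opens the batching window
    start = monotonic()
    while len(batch) < max_batch_size:
        current = monotonic()
        # Remaining budget = total window minus time elapsed since the window
        # opened; this is the corrected formula proposed in the issue.
        timeout = max_batch_time - (current - start)
        if timeout <= 0:
            break
        try:
            batch.append(await asyncio.wait_for(queue.get(), timeout))
        except asyncio.TimeoutError:
            break
    return batch
```

With `max_batch_time=1`, a batch whose first request arrives at `07:27:26,260` would stay open until roughly `07:27:27,260`, rather than closing after ~130 ms as observed.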