aws / sagemaker-inference-toolkit

Serve machine learning models within a 🐳 Docker container using 🧠 Amazon SageMaker.
Apache License 2.0
370 stars 82 forks source link

Server Timeout Unit is Minutes in MMS, but docstring says Seconds #99

Closed kastman closed 1 month ago

kastman commented 2 years ago

Describe the bug The model server timeout ("used for model server's backend workers before they are deemed unresponsive and rebooted") currently set in with env vars using SAGEMAKER_MODEL_SERVER_TIMEOUT is listed in seconds in the property method description docstring...

def model_server_timeout(self):  # type: () -> int
    """int: Timeout, in **seconds**, used for model server's backend workers before
    they are deemed unresponsive and rebooted.
    """
    return self._model_server_timeout

...but the actual unit used downstream in multi-model server worker manager is minutes, not seconds.

// TODO: Change this to configurable param
ModelWorkerResponse reply = replies.poll(responseTimeout, TimeUnit.MINUTES);

Because of this, the default timeout of 20 in inference toolkit is actually a 20 minute timeout, not a 20 second timeout.

It seems odd that the unit is minutes, and because this is a parsed as an int in inference-toolkit argparse it does only give a resolution of whole minutes (instead of say, .33 minutes for a 20s equivalent timeout), so should I report this downstream in multi-model-server? If you don't want to change it, we should at least fix the docstring in inference-toolkit.

lorenzwalthert commented 1 month ago

Was this closed with https://github.com/aws/sagemaker-inference-toolkit/pull/129?

kastman commented 1 month ago

Yes, looks like the Seconds variable fixes this; thanks for pointing it out!