awslabs / multi-model-server

Multi Model Server is a tool for serving neural net models for inference
Apache License 2.0
984 stars 230 forks source link

Add config for request timeout in seconds #1019

Closed davidthomas426 closed 1 year ago

davidthomas426 commented 1 year ago

Before or while filing an issue please feel free to join our slack channel to get in touch with development team, ask questions, find out what's cooking and more!

Issue #, if available:

Description of changes:

Response timeout is documented to be in seconds. However, in the actual code, it's interpreted in minutes.

By the way, TorchServe has the exact same config variable, but it's interpreted in seconds.

For backward compatibility reasons, it feels risky to just change it, since it could theoretically break a bunch of models currently working in production. So, for now, this PR changes the internal logic to process the timeout in seconds, and uses "seconds" in the variable name for all related variables that represent the timeout in seconds. It then adds an extra config with a "_SECONDS" prefix, to give users a way to configure a timeout in seconds (with less than a minute).

We'll probably just change it to work as documented later. But at least this unblocks customers that want a timeout that's less then one minute, which the code before this PR doesn't allow for that at all.

Testing done:

Added a new unit test and verified that the code builds and the unit tests pass.

To run CI tests on your changes refer README.md

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.