Open aleecoveo opened 1 year ago
Hi,
thanks for filing the issue.
I can see how this could be an issue. What I am not sure about is whether this is a documentation bug (i.e. remove "with inference response" so the sentence applies to any response) or whether there is value in changing the behavior more generally.
My understanding is that the timeout exists to recover from rare failure and edge cases. Are you seeing workers die frequently, and is that why you set a low timeout?
Would be great to hear more about the use case!
Hello!
In fact, what we were originally intending to do with the timeout doesn't work (somewhere along the line we misread the documentation, and somehow didn't realize it even while writing this issue! 🤦). In our case, the service that calls TorchServe has a timeout, and we didn't want the backend to waste time processing a request that the client was no longer waiting for (while fresh requests sit idle in the queue). But setting response_timeout obviously won't work for that, since we don't want the worker to reboot.
For cases like this, reverse proxies use a LIFO queue for requests (e.g. Skipper); does Torchserve use LIFO or FIFO? Is there anything else you would suggest to prevent the backend from processing requests that nobody is waiting for?
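The "don't process requests nobody is waiting for" idea can be sketched independently of TorchServe: each request carries a deadline, and the worker skips any queued request whose deadline has already passed. This is a minimal standalone sketch, not TorchServe code; the `Request` type and `drain_queue` helper are made up for illustration.

```python
import time
from collections import deque
from dataclasses import dataclass

@dataclass
class Request:
    payload: str
    deadline: float  # absolute time after which the client gives up

def drain_queue(queue, now=None):
    """Process queued requests, skipping any whose deadline has passed."""
    now = time.monotonic() if now is None else now
    processed, dropped = [], []
    while queue:
        req = queue.popleft()  # FIFO order; use queue.pop() for LIFO
        if req.deadline < now:
            dropped.append(req.payload)    # client has already timed out
        else:
            processed.append(req.payload)  # still worth the compute
    return processed, dropped

queue = deque([
    Request("stale", deadline=time.monotonic() - 1.0),
    Request("fresh", deadline=time.monotonic() + 5.0),
])
processed, dropped = drain_queue(queue)
print(processed, dropped)  # ['fresh'] ['stale']
```

Whether FIFO or LIFO, dropping past-deadline requests at dequeue time is what keeps the worker from burning time on abandoned work.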
For the bug itself, I agree that it would probably be sufficient to change the documentation.
Thanks very much!
🐛 Describe the bug
According to the Management API docs, "response_timeout - If the model's backend worker doesn't respond with inference response within this timeout period, the worker will be deemed unresponsive and rebooted", but the timeout in fact seems to apply to any call, notably to the call to a backend worker to register the model. This means that if the model takes longer than response_timeout seconds to initialize, it cannot be served.

Error logs
Here's the log you get when you register a model that takes longer to load than response_timeout (set to 2 in this case):

Installation instructions
TorchServe run using the docker image 15279ec970c4, which has version 0.7.0

Model Packaging
Model files are here: https://gist.github.com/aleecoveo/75edbb6b9ac368e84a29dcf0fe96e439
Archived with:
torch-model-archiver --model-name long-init-model --version 0.0.1 --model-file dummy_model.py --serialized-file dummy_model.pt --handler ./handler_with_long_initialization
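For context, a handler whose initialization deliberately outlasts response_timeout can be sketched as below. This is a standalone illustration (the actual handler_with_long_initialization in the gist may differ, and a real TorchServe handler would typically subclass ts.torch_handler.base_handler.BaseHandler and receive a real context object):

```python
import time

# Hypothetical stand-in for a TorchServe custom handler whose
# initialize() takes longer than response_timeout. In TorchServe the
# method receives a context object; here it is unused.
class HandlerWithLongInitialization:
    INIT_SECONDS = 3  # longer than response_timeout=2 in the repro

    def __init__(self):
        self.initialized = False

    def initialize(self, context=None):
        # Simulate slow model loading (large weights, warm-up, etc.).
        time.sleep(self.INIT_SECONDS)
        self.initialized = True

    def handle(self, data, context=None):
        return ["ok"]

handler = HandlerWithLongInitialization()
handler.INIT_SECONDS = 0.1  # keep the demo fast when run directly
handler.initialize()
print(handler.initialized)  # True
```

With response_timeout=2 and a 3-second initialize, the worker is killed before registration can complete, which is the behavior the issue describes.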
config.properties
No response
Versions
Repro instructions
Get the files from https://gist.github.com/aleecoveo/75edbb6b9ac368e84a29dcf0fe96e439 and cd to the directory

docker run -it -p 8080:8080 -p 8081:8081 -p 8082:8082 -p 7070:7070 -p 7071:7071 -v $(pwd):/home/model-server/model-store 15279ec970c4
curl -X POST "http://localhost:8081/models?url=long-init-model.mar&initial_workers=1&response_timeout=2"
The call fails with

Possible Solution
No response