awslabs / multi-model-server

Multi Model Server is a tool for serving neural net models for inference
Apache License 2.0

config MMS to process the newest request only #902

Open carter54 opened 4 years ago

carter54 commented 4 years ago

Hello, is it possible to configure MMS so that it processes only the newest request and stops any previous requests that have not finished?

vdantu commented 4 years ago

@carter54: If you want to stop model workers from processing requests, you can set the response timeout to a lower value. This stops the processing of any request that takes longer than the response timeout.

The request queue in MMS ensures that older messages are processed first (FIFO). There is currently no mechanism to flush this queue, so if you are looking to delete messages from the request queue, I don't think that is possible today.
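
Since the queue inside MMS cannot be flushed, one possible workaround is to drop stale requests on the caller's side before they ever reach MMS. The following is only a minimal sketch of that idea, not an MMS feature: `LatestOnlySlot` and its methods are hypothetical names, and it assumes a single forwarding thread in your own client takes one request at a time and sends it to MMS. It does not cancel a request that MMS is already processing; the response timeout remains the only lever for that.

```python
import threading


class LatestOnlySlot:
    """Hypothetical client-side holder (not part of MMS).

    Keeps at most one pending request. Submitting a newer request
    replaces the one that has not been taken yet.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = None
        self._has_pending = False
        self.dropped = 0  # number of requests replaced before being sent

    def submit(self, request):
        """Store the request, discarding any older pending one."""
        with self._cond:
            if self._has_pending:
                self.dropped += 1
            self._pending = request
            self._has_pending = True
            self._cond.notify()

    def take(self, timeout=None):
        """Return the newest pending request, or None if the wait times out."""
        with self._cond:
            if not self._cond.wait_for(lambda: self._has_pending, timeout):
                return None
            request = self._pending
            self._pending = None
            self._has_pending = False
            return request


# Three requests arrive before the forwarding thread is ready for the next one.
slot = LatestOnlySlot()
for req in ("req-1", "req-2", "req-3"):
    slot.submit(req)

newest = slot.take(timeout=0.1)  # only "req-3" would be forwarded to MMS
```

With this arrangement the forwarding thread would call `take()`, send the result to the MMS inference endpoint, wait for the response, and loop, so the FIFO queue in MMS never holds more than one request from that client.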

carter54 commented 4 years ago

@vdantu I see, thanks for the reply. If I have to delete unfinished requests from the queue, do you have any recommendations? Thanks!

vdantu commented 4 years ago

I can't think of any good way to do this as of now.