awslabs / multi-model-server

Multi Model Server is a tool for serving neural net models for inference
Apache License 2.0

MemoryError handling #896

Open davidas1 opened 4 years ago

davidas1 commented 4 years ago

In this test it is mentioned that MMS expects the handler to raise MemoryError when it can no longer allocate memory for workers. I didn't find anything in the docs about the effects of this, or about how exactly MMS treats this error differently.

  1. Does this affect MMS behaviour, for example subsequent requests to register models?
  2. Is this observable from outside MMS (i.e. through the REST API, without parsing the server logs)? I get a 507 error with a generic "Internal Server Error" message, so I can't tell whether the error is due to OOM or to a failure to load the model (I want my application to behave differently in each of those cases).
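
To make the first paragraph concrete, here is a minimal sketch of a handler that turns a framework-specific out-of-memory failure into a `MemoryError`. The class name `ModelHandler`, the `load_fn` parameter, and the "out of memory" substring check are all hypothetical and only for illustration; they are not taken from MMS, and the exact error your framework raises may differ.

```python
class ModelHandler:
    """Hypothetical custom service; all names here are illustrative."""

    def __init__(self, load_fn):
        # load_fn stands in for whatever framework call loads the model.
        self.load_fn = load_fn
        self.model = None

    def initialize(self, context):
        try:
            self.model = self.load_fn(context)
        except RuntimeError as err:
            # Assumption: the framework reports OOM as a RuntimeError whose
            # message contains "out of memory". Re-raise it as MemoryError
            # so the server can tell it apart from other load failures.
            if "out of memory" in str(err).lower():
                raise MemoryError(str(err)) from err
            raise
```

Any other `RuntimeError` is re-raised unchanged, so an ordinary load failure is not reported as memory exhaustion.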

vdantu commented 4 years ago

For synchronous API calls, MMS simply returns a 507 error if the call caused an OOM. You have to rely on the HTTP error code, not the error message, to handle the error.

I didn't fully understand your first question. If an OOM occurs, the model fails to load or the request fails, so you need to handle this error code by reducing the number of models loaded on MMS. Otherwise, subsequent requests may also fail.
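
A client could branch on the status code roughly as follows. This is only a sketch using the Python standard library: the function names are hypothetical, and the `POST /models?url=...` request shape is an assumption that should be checked against the management API of your MMS version. The thread only confirms that 507 signals an OOM, so every other non-2xx code is lumped into a generic failure here.

```python
import urllib.error
import urllib.parse
import urllib.request

HTTP_INSUFFICIENT_STORAGE = 507  # code this thread says MMS returns on OOM


def classify_status(status_code):
    """Map an HTTP status code to a coarse outcome label."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == HTTP_INSUFFICIENT_STORAGE:
        return "out_of_memory"
    return "other_failure"


def register_model(management_url, model_url):
    """POST a register request and return the classified outcome.

    Assumes the management endpoint accepts POST /models?url=...
    """
    query = urllib.parse.urlencode({"url": model_url})
    request = urllib.request.Request(
        f"{management_url}/models?{query}", method="POST"
    )
    try:
        with urllib.request.urlopen(request) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is on the exception.
        return classify_status(err.code)
```

On `"out_of_memory"` the application could unregister a model before retrying, while `"other_failure"` would be treated as a problem with the model itself.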

Please let us know if this answers your question.