max_request_size not working

johncolby commented 5 years ago

Hello, I am trying to deploy a 3D segmentation model that operates on ~9MB binary data inputs, so I appreciate the recent option to increase max_request_size in cb174b08a3a96b60f3fff0a5eb80fd1b86fe381d. I updated my mms installation to 1.0.2, added max_request_size=10000000 to config.properties, and confirmed this setting is recognized in the mms startup log. However, I am still getting the following error trace upon POSTing an inference request:

2019-03-13 16:56:46,043 [INFO ] W-9000-unet-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -     request = _retrieve_request(conn)
2019-03-13 16:56:46,046 [INFO ] W-9000-unet-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   File "/home/jcolby/.conda/envs/mms/lib/python3.7/site-packages/mms/protocol/otf_message_handler.py", line 221, in _retrieve_request
2019-03-13 16:56:46,046 [INFO ] W-9000-unet-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -     input_data = _retrieve_input_data(conn)
2019-03-13 16:56:46,046 [INFO ] W-9000-unet-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   File "/home/jcolby/.conda/envs/mms/lib/python3.7/site-packages/mms/protocol/otf_message_handler.py", line 271, in _retrieve_input_data
2019-03-13 16:56:46,047 [INFO ] W-9000-unet-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -     value = _retrieve_buffer(conn, length)
2019-03-13 16:56:46,047 [INFO ] W-9000-unet-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   File "/home/jcolby/.conda/envs/mms/lib/python3.7/site-packages/mms/protocol/otf_message_handler.py", line 127, in _retrieve_buffer
2019-03-13 16:56:46,047 [INFO ] W-9000-unet-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -     raise ValueError("Exceed max buffer size: {}".format(length))
2019-03-13 16:56:46,047 [INFO ] W-9000-unet-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - ValueError: Exceed max buffer size: 9210483
2019-03-13 16:56:46,058 [INFO ] W-9000-unet ACCESS_LOG - /127.0.0.1:57122 "POST /predictions/unet HTTP/1.1" 500 117

It looks like it may be still hardcoded in one spot?

https://github.com/awslabs/mxnet-model-server/blob/09c7665d5991727d55b411017257f8d716359448/mms/protocol/otf_message_handler.py#L21

https://github.com/awslabs/mxnet-model-server/blob/09c7665d5991727d55b411017257f8d716359448/mms/protocol/otf_message_handler.py#L125-L129

Thanks for looking into this. -John

vdantu commented 5 years ago

@johncolby : Thanks a ton for finding this. I can raise a PR for this.

johncolby commented 5 years ago

Awesome, thanks so much for all the great work!

awslabs / multi-model-server

max_request_size not working #771