tensorflow / serving

A flexible, high-performance serving system for machine learning models
https://www.tensorflow.org/serving
Apache License 2.0

Segmentation fault when running TF Serving with NVIDIA MPS #1326

Closed qianlin404 closed 5 years ago

qianlin404 commented 5 years ago

System information

Describe the problem

I am experimenting with TensorFlow Serving on GPU together with the NVIDIA Multi-Process Service (MPS). I run TF Serving with the Docker image tensorflow/serving:latest-gpu. Everything works properly when MPS is disabled. However, when I enable MPS using sudo nvidia-cuda-mps-control -d and then run TF Serving, I get the following error:

2019-04-20 20:19:22.916586: I tensorflow_serving/model_servers/server.cc:82] Building single TensorFlow model file config:  model_name: mobilenet_v2_ssd model_base_path: /models/mobilenet_v2_ssd
2019-04-20 20:19:22.916786: I tensorflow_serving/model_servers/server_core.cc:461] Adding/updating models.
2019-04-20 20:19:22.916810: I tensorflow_serving/model_servers/server_core.cc:558]  (Re-)adding model: mobilenet_v2_ssd
2019-04-20 20:19:23.017200: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: mobilenet_v2_ssd version: 1}
2019-04-20 20:19:23.017247: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: mobilenet_v2_ssd version: 1}
2019-04-20 20:19:23.017260: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: mobilenet_v2_ssd version: 1}
2019-04-20 20:19:23.017297: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /models/mobilenet_v2_ssd/0000001
2019-04-20 20:19:23.017324: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/mobilenet_v2_ssd/0000001
2019-04-20 20:19:23.085294: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-04-20 20:19:23.120551: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
/usr/bin/tf_serving_entrypoint.sh: line 3:     6 Segmentation fault      (core dumped) tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"

Exact Steps to Reproduce

I run my experiment on an AWS Deep Learning AMI (Ubuntu) Version 22.0 with instance type p3.2xlarge. The GPU information is as follows:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:1E.0 Off |                    0 |
| N/A   43C    P0    29W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Source code / logs

The command I use to run TF Serving is as follows:

docker run -t --rm -p 8500:8500 -p 8501:8501 --runtime=nvidia \
        -v "$model_path:/models/$model_name" \
        -e MODEL_NAME=$model_name \
        tensorflow/serving:latest-gpu &

The command I use to enable MPS is:

sudo nvidia-cuda-mps-control -d
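
For reference, the MPS control daemon reads commands on stdin, so the standard control commands can be used to check whether an MPS server is running and to shut the daemon down again (these are generic MPS commands, not specific to this setup):

# List the PIDs of any running MPS servers
echo get_server_list | sudo nvidia-cuda-mps-control

# Shut the MPS control daemon down again
echo quit | sudo nvidia-cuda-mps-control
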
gowthamkpr commented 5 years ago

@qianlin404 Can you please take a look at this issue and let me know if it answers your question? Thanks!

tomandjerrygit commented 5 years ago

Hello, I tried to run Serving with MPS and referred to https://github.com/NVIDIA/nvidia-docker/issues/419. You should first install nvidia-docker2 and then follow the steps in that issue.
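
The usual way to let a containerized process talk to an MPS daemon on the host is to share the host IPC namespace and, if needed, the MPS pipe directory. A rough sketch, assuming MPS uses its default pipe directory /tmp/nvidia-mps and the same variables as in the original report:

docker run -t --rm -p 8500:8500 -p 8501:8501 --runtime=nvidia \
        --ipc=host \
        -v /tmp/nvidia-mps:/tmp/nvidia-mps \
        -v "$model_path:/models/$model_name" \
        -e MODEL_NAME=$model_name \
        tensorflow/serving:latest-gpu &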

qianlin404 commented 5 years ago

Hi @tomandjerrygit, thanks for the pointer. It sounds like this is an IPC problem: the MPS daemon is running on the host, and the process inside the Docker container cannot communicate with it. After adding --ipc=host to the docker run command, it works properly.
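
In other words, a sketch of the working invocation based on the above is just the original docker run command with the host IPC namespace shared, nothing else changed:

docker run -t --rm -p 8500:8500 -p 8501:8501 --runtime=nvidia \
        --ipc=host \
        -v "$model_path:/models/$model_name" \
        -e MODEL_NAME=$model_name \
        tensorflow/serving:latest-gpu &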
