tobegit3hub / simple_tensorflow_serving

Generic and easy-to-use serving service for machine learning models
https://stfs.readthedocs.io
Apache License 2.0

How to limit GPU memory? #27

Closed Johnson-yue closed 6 years ago

Johnson-yue commented 6 years ago

Hi, I updated your docker image tobegit3hub/simple_tensorflow_serving:latest-gpu and tested it. I found some problems:

  1. Your Docker image runs simple_tensorflow_serving --model_config_file="./examples/model_config_file.json" by default, but ONNX is not installed. It is a small problem!

  2. The GPU build of TF Serving allocates ALL of the GPU memory by default, which is terrible! Look here, and luckily someone has fixed it (fixed code). Can you add these config options and re-compile TF Serving? By the way, the method used there is per_process_gpu_memory_fraction, but I think allow_growth=True would work too; I just do not know how to do it (see the sketch below this list). Maybe GPU usage should depend on the model, right?
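
For reference, here is a minimal TensorFlow 1.x sketch of the two options mentioned in point 2, using plain TensorFlow rather than simple_tensorflow_serving itself (the 0.4 fraction is just an example value):

```python
import tensorflow as tf

# Option 1: cap this process at a fixed fraction of each GPU's memory
# (0.4 here is an arbitrary example value).
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4

# Option 2: grow allocations on demand instead of grabbing
# all GPU memory up front.
# config.gpu_options.allow_growth = True

sess = tf.Session(config=config)
```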

tobegit3hub commented 6 years ago

For the first question, we don't need the onnx Python package because we use onnx_python, which is already merged into the mxnet Python package. I think you can run all the ONNX models without any other problems.

We can add parameters such as per_process_gpu_memory_fraction to control the GPU devices soon.
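
A minimal sketch of how such a parameter could be wired in, assuming the server accepted a JSON session config and built a tf.ConfigProto from it; the helper name and JSON keys below are illustrative, not the project's confirmed API:

```python
import json
import tensorflow as tf

def build_session_config(session_config_json):
    # Hypothetical helper: turn a JSON string such as
    # '{"allow_growth": true, "per_process_gpu_memory_fraction": 0.5}'
    # into a tf.ConfigProto for the serving session.
    options = json.loads(session_config_json)
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = options.get("allow_growth", False)
    fraction = options.get("per_process_gpu_memory_fraction", 0.0)
    if fraction > 0:
        config.gpu_options.per_process_gpu_memory_fraction = fraction
    return config
```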

Johnson-yue commented 6 years ago

@tobegit3hub Oh, maybe I got something wrong; I will check it. I look forward to the new version with parameters for controlling the GPU devices.

tobegit3hub commented 6 years ago

It is supported now and you can use it with pip install -U "simple_tensorflow_serving>=0.6.4".

For more usage, refer to https://github.com/tobegit3hub/simple_tensorflow_serving#gpu-acceleration.
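
For example, installing the release and launching with GPU memory options might look like this; the --session_config flag and its JSON keys are an assumption here, so check the linked README section for the exact option names:

```bash
pip install -U "simple_tensorflow_serving>=0.6.4"

# Assumed flag: pass GPU options as a JSON session config.
simple_tensorflow_serving \
  --model_base_path="./model" \
  --session_config='{"allow_growth": true, "per_process_gpu_memory_fraction": 0.5}'
```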