tobegit3hub / simple_tensorflow_serving

Generic and easy-to-use serving service for machine learning models
https://stfs.readthedocs.io
Apache License 2.0

About STFS-gpu Performance #30

Open Johnson-yue opened 6 years ago

Johnson-yue commented 6 years ago

Hi, my model works and everything is OK, thank you. But in my test cases I found some issues.

1) Usage of GPU memory:

My model is ResNet-50 and I set the session_config flags "log_device_placement": true, "allow_soft_placement": true and "allow_growth": true. I do not use "per_process_gpu_memory_fraction": 0.5 because it reserves 50% of GPU memory whether the model is small or big. When I start the serving for the first time, GPU memory usage is 340+ MB, which I think is reasonable. But after I run the client code once or more, GPU memory usage grows until it reaches 7.4 GB. I do not know why; have you checked this?
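For reference, those flags map onto TensorFlow's tf.ConfigProto roughly as below; this is a minimal TF 1.x sketch, not taken from the STFS code, and the session body is a placeholder:

```python
import tensorflow as tf

# Rough TF 1.x equivalent of the session_config options discussed above.
config = tf.ConfigProto(
    log_device_placement=True,  # log which device each op is placed on
    allow_soft_placement=True,  # fall back to CPU if a GPU kernel is missing
)
# allow_growth starts with a small allocation (e.g. the ~340 MB seen at
# startup) and keeps growing as kernels request more memory; the allocator
# never shrinks, which is why usage can climb to several GB after the
# first requests are served.
config.gpu_options.allow_growth = True
# Alternatively, cap the allocator at a fixed fraction of GPU memory:
# config.gpu_options.per_process_gpu_memory_fraction = 0.5

with tf.Session(config=config) as sess:
    pass  # load the graph and run inference here
```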

2) Inference latency:

I tested my model with the frozen .pb in plain Session mode, and sess.run() takes about 6-7 ms. But when I deploy the same model on STFS with GPU, a single request costs 40 ms! I know you have benchmarked STFS performance against other deployment frameworks, but have you compared the cost of Session.run() with a TF-Serving run?
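A rough way to measure the gap from the client side; in this sketch the endpoint, model name, input key, and shape are assumptions to be adjusted to the deployed model:

```python
import time
import requests

ENDPOINT = "http://127.0.0.1:8500"  # assumed default STFS address
# Hypothetical payload; adjust the "data" keys to match the model's inputs.
payload = {"model_name": "default", "model_version": 1,
           "data": {"images": [[0.0] * 224 * 224 * 3]}}

# Warm up once so CUDA/graph initialization is not counted.
requests.post(ENDPOINT, json=payload)

start = time.time()
for _ in range(100):
    requests.post(ENDPOINT, json=payload)
elapsed_ms = (time.time() - start) / 100 * 1000
print("mean round-trip per request: %.1f ms" % elapsed_ms)
# Comparing this against the bare sess.run() time measured the same way
# isolates the HTTP + JSON (de)serialization overhead.
```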

tobegit3hub commented 6 years ago

For the first question, GPU memory usage depends on your model and the batch size. The model itself may be only 340 MB, but an operation such as a matrix multiplication needs more GPU memory to hold its intermediate results.
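As a back-of-envelope illustration (the layer shape and batch size below are made-up, ResNet-style numbers, not measured from this model):

```python
# Activation memory for a single float32 conv feature map.
batch, height, width, channels = 32, 56, 56, 256
bytes_per_float = 4
activation_mb = batch * height * width * channels * bytes_per_float / 1024 ** 2
print("one feature map: %.0f MB" % activation_mb)  # ~98 MB for this shape
# A forward pass keeps many such maps live at once, so activations can
# dwarf the size of the weights themselves.
```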

For the second question, you can run the server with --log_level=debug and it will print the time of sess.run(). Simple TensorFlow Serving takes extra time to process and respond to HTTP requests, but it should not be much slower. We have performance tests against TensorFlow Serving, and the time is close to that of sess.run() with the TensorFlow Python APIs.