Creating a new TensorFlow device every time I run the model

I'm trying to run the tensorflow-deeplab-v3 model on a server to segment images that I send to it. Everything works, but every time I send an image the model looks for the GPU and creates a new GPU device, and this device creation costs around 10 seconds per request. How can I prevent the model from creating a device every time and make it reuse the one it created previously?

I'm running my server on an Amazon p2.xlarge EC2 instance. The OS info is:

Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:    16.04
Codename:   xenial

nvidia-smi output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.40.04    Driver Version: 418.40.04    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   35C    P8    28W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

nvcc --version output:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

Python version: 3.5.2
pip version: 19.1.1

pip list output:

Package              Version        
-------------------- ---------------
absl-py              0.7.1          
astor                0.8.0          
bottle               0.12.16        
certifi              2019.3.9       
chardet              3.0.4          
cycler               0.10.0         
gast                 0.2.2          
get                  2019.4.13      
google-pasta         0.1.7          
grpcio               1.21.1         
h5py                 2.9.0          
idna                 2.8            
Keras-Applications   1.0.8          
Keras-Preprocessing  1.1.0          
kiwisolver           1.1.0          
Markdown             3.1.1          
matplotlib           3.0.3          
mock                 3.0.5          
numpy                1.16.4         
opencv-python        4.1.0.25       
Pillow               6.0.0          
pip                  19.1.1         
post                 2019.4.13      
protobuf             3.8.0          
public               2019.4.13      
pyparsing            2.4.0          
python-dateutil      2.8.0          
query-string         2019.4.13      
request              2019.4.13      
requests             2.22.0         
setuptools           41.0.1         
six                  1.12.0         
tb-nightly           1.14.0a20190614
tensorboard          1.14.0         
tensorflow-estimator 1.14.0         
tensorflow-gpu       1.14.0         
termcolor            1.1.0          
urllib3              1.25.3         
Werkzeug             0.15.4         
wheel                0.33.4         
wrapt                1.11.2

Output for the first request sent to the server (the numbers in parentheses are elapsed times in seconds):

(Cabin) ubuntu@ip-172-31-18-152:~/Cabin$ CUDA_VISIBLE_DEVICES=0 python Cabin.py --private_ip 172.31.18.152
Searching for gpus...
2019-06-23 11:17:35.611990: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-06-23 11:17:35.681815: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:17:35.682581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:1e.0
2019-06-23 11:17:35.685889: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-06-23 11:17:35.747229: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-06-23 11:17:35.778084: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-06-23 11:17:35.787495: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-06-23 11:17:35.856472: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-06-23 11:17:35.898971: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-06-23 11:17:36.013921: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-06-23 11:17:36.014076: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:17:36.014873: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:17:36.015586: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
Found all gpus. (0.40801453590393066)
Generating model...
Model ready. (0.0017066001892089844)
Bottle v0.12.16 server starting up (using WSGIRefServer())...
Listening on http://172.31.18.152:8080/
Hit Ctrl-C to quit.

Request arrived.
Downloading images...
Download complete. (0.23528265953063965)
Preparing images...
Images ready. (0.013093709945678711)
Saving images...
Images saved. (0.09435057640075684)
Evaluating model...
Preparing list...
List generated (0.00017762184143066406)
Loading images...
WARNING: Logging before flag parsing goes to stderr.
W0623 11:17:57.318472 140174189094656 deprecation_wrapper.py:119] From /home/ubuntu/Cabin/DeepLab/tensorflow_deeplab_v3_plus/utils/dataset_util.py:60: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

Images loaded (0.0007865428924560547)
Inside device
Predicting...
Predictions completed. (5.245208740234375e-06)
Calling zip function...
Zip() complete. (1.6689300537109375e-06)
Zipped: <zip object at 0x7f7c70280cc8>
Writing output masks...
W0623 11:17:57.343004 140174189094656 deprecation_wrapper.py:119] From /home/ubuntu/Cabin/DeepLab/tensorflow_deeplab_v3_plus/utils/preprocessing.py:232: The name tf.read_file is deprecated. Please use tf.io.read_file instead.

W0623 11:17:57.421230 140174189094656 deprecation.py:323] From /home/ubuntu/Cabin/DeepLab/tensorflow_deeplab_v3_plus/utils/preprocessing.py:234: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0623 11:17:57.440225 140174189094656 deprecation.py:323] From /home/ubuntu/Cabin/DeepLab/tensorflow_deeplab_v3_plus/utils/preprocessing.py:261: DatasetV1.make_one_shot_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `for ... in dataset:` to iterate over a dataset. If using `tf.estimator`, return the `Dataset` object directly from your input function. As a last resort, you can use `tf.compat.v1.data.make_one_shot_iterator(dataset)`.
W0623 11:18:02.673879 140174189094656 deprecation_wrapper.py:119] From /home/ubuntu/Cabin/DeepLab/tensorflow_deeplab_v3_plus/deeplab_model.py:35: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

W0623 11:18:03.029822 140174189094656 deprecation_wrapper.py:119] From /home/ubuntu/Cabin/DeepLab/tensorflow_deeplab_v3_plus/deeplab_model.py:60: The name tf.image.resize_bilinear is deprecated. Please use tf.compat.v1.image.resize_bilinear instead.

W0623 11:18:03.216465 140174189094656 deprecation.py:323] From /home/ubuntu/Cabin/DeepLab/tensorflow_deeplab_v3_plus/deeplab_model.py:178: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
    options available in V2.
    - tf.py_function takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.
    - tf.numpy_function maintains the semantics of the deprecated tf.py_func
    (it is not differentiable, and manipulates numpy arrays). It drops the
    stateful argument making all functions stateful.

W0623 11:18:03.848108 140174189094656 deprecation.py:323] From /home/ubuntu/Cabin/Cabin/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py:1354: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
2019-06-23 11:18:04.924563: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-06-23 11:18:04.998014: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:18:04.998832: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7ab3a30 executing computations on platform CUDA. Devices:
2019-06-23 11:18:04.998864: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Tesla K80, Compute Capability 3.7
2019-06-23 11:18:05.020871: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300055000 Hz
2019-06-23 11:18:05.021623: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7b7fda0 executing computations on platform Host. Devices:
2019-06-23 11:18:05.021653: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-06-23 11:18:05.021919: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:18:05.022751: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:1e.0
2019-06-23 11:18:05.022824: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-06-23 11:18:05.022866: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-06-23 11:18:05.022889: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-06-23 11:18:05.022952: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-06-23 11:18:05.022989: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-06-23 11:18:05.023012: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-06-23 11:18:05.023040: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-06-23 11:18:05.023106: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:18:05.023844: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:18:05.024511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-06-23 11:18:05.025461: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-06-23 11:18:05.028172: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-23 11:18:05.028201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-06-23 11:18:05.028214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-06-23 11:18:05.029583: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:18:05.030301: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:18:05.031000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10805 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7)
W0623 11:18:05.032312 140174189094656 deprecation.py:323] From /home/ubuntu/Cabin/Cabin/lib/python3.5/site-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2019-06-23 11:18:11.533404: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
2019-06-23 11:18:13.546536: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
Preparing paths...
Paths ready. (2.0742416381835938e-05)
generating: /home/ubuntu/Cabin/ModelOutput/test_front_mask.png
Generated. (1.1920928955078125e-06)
Prediction took: 21.077475786209106
Cropping /home/ubuntu/Cabin/ModelOutput/test_front_mask.png
Cropped and wrote to file. (0.0765237808227539)
Preparing paths...
Paths ready. (2.09808349609375e-05)
generating: /home/ubuntu/Cabin/ModelOutput/test_side_mask.png
Generated. (4.76837158203125e-06)
Prediction took: 0.457857608795166
Cropping /home/ubuntu/Cabin/ModelOutput/test_side_mask.png
Cropped and wrote to file. (0.06001448631286621)
Collecting trashes...
All clear! (0.0003724098205566406)
Evaluation complete. (21.77125883102417)
Measuring...
Measuring complete. (1.4657764434814453)
78.181.181.107 - - [23/Jun/2019 11:18:20] "GET / HTTP/1.1" 200 0

Output for requests after the first one:

78.181.181.107 - - [23/Jun/2019 11:18:20] "GET / HTTP/1.1" 200 0
Request arrived.
Downloading images...
Download complete. (0.24880599975585938)
Preparing images...
Images ready. (0.00023603439331054688)
Saving images...
Images saved. (0.0910639762878418)
Evaluating model...
Preparing list...
List generated (0.00019860267639160156)
Loading images...
Images loaded (0.0002944469451904297)
Inside device
Predicting...
Predictions completed. (6.67572021484375e-06)
Calling zip function...
Zip() complete. (3.0994415283203125e-06)
Zipped: <zip object at 0x7f7c709369c8>
Writing output masks...
2019-06-23 11:22:42.036040: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:22:42.036423: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:1e.0
2019-06-23 11:22:42.036502: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-06-23 11:22:42.036540: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-06-23 11:22:42.036572: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-06-23 11:22:42.036604: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-06-23 11:22:42.036637: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-06-23 11:22:42.036669: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-06-23 11:22:42.036702: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-06-23 11:22:42.036776: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:22:42.037106: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:22:42.037385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-06-23 11:22:42.037430: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-23 11:22:42.037448: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-06-23 11:22:42.037465: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-06-23 11:22:42.037643: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:22:42.037953: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-23 11:22:42.038233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10805 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7)
Preparing paths...
Paths ready. (2.3365020751953125e-05)
generating: /home/ubuntu/Cabin/ModelOutput/test_front_mask.png
Generated. (9.5367431640625e-07)
Prediction took: 11.09858751296997
Cropping /home/ubuntu/Cabin/ModelOutput/test_front_mask.png
Cropped and wrote to file. (0.06068730354309082)
Preparing paths...
Paths ready. (2.4557113647460938e-05)
generating: /home/ubuntu/Cabin/ModelOutput/test_side_mask.png
Generated. (0.0004572868347167969)
Prediction took: 0.47649669647216797
Cropping /home/ubuntu/Cabin/ModelOutput/test_side_mask.png
Cropped and wrote to file. (0.06105923652648926)
Collecting trashes...
All clear! (0.000209808349609375)
Evaluation complete. (11.765886068344116)
Measuring...
Measuring complete. (1.4767637252807617)
78.181.181.107 - - [23/Jun/2019 11:22:48] "GET / HTTP/1.1" 200 0
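
The timings in the two logs above are consistent with the predict call being lazy. `tf.estimator.Estimator.predict` is a generator, so the call itself returns in microseconds ("Predictions completed. (6.67572021484375e-06)") and the setup cost only shows up when the loop pulls the first item ("Prediction took: 11.09858751296997"). Below is a minimal sketch of that behaviour in plain Python. `fake_predict` is a hypothetical stand-in for the estimator, and `time.sleep` is an assumed stand-in for the one-time setup work (graph build, session and device creation); no TensorFlow is involved.

```python
import time


def fake_predict(items):
    """Hypothetical stand-in for a lazy predict call.

    Nothing in this body runs until the caller starts iterating.
    """
    time.sleep(0.2)  # simulated one-time setup cost (assumption, not real TF work)
    for item in items:
        yield item * 2


# Calling a generator function only creates the generator object.
start = time.time()
predictions = fake_predict([1, 2, 3])
call_time = time.time() - start

# The setup cost is paid when the first result is requested.
start = time.time()
first = next(predictions)
first_time = time.time() - start

# Later results come from the already-running generator.
start = time.time()
second = next(predictions)
second_time = time.time() - start

print("call:", call_time, "first item:", first_time, "second item:", second_time)
```

If the same holds for the real model, each request pays the setup cost again because each request makes a fresh predict call and iterates it from the start.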

I embedded the inference script in the script that runs my server; it is shown below. (For testing purposes I download the images from a fixed source, and the script is not yet complete.) The GPU device is created at line 161, on entering the 'for pred_dict, image_path in zipped:' loop:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import time
import argparse
import os
import glob
from io import BytesIO

import tensorflow as tf
import cv2

import DeepLab.tensorflow_deeplab_v3_plus.deeplab_model as deeplab_model
from DeepLab.tensorflow_deeplab_v3_plus.utils import preprocessing
from DeepLab.tensorflow_deeplab_v3_plus.utils import dataset_util

from PIL import Image
#import matplotlib.pyplot as plt

from tensorflow.python import debug as tf_debug

from bottle import run, post, request, response, route  # response is used in measure()
import requests

import Cropper
import Measure

parser = argparse.ArgumentParser()

parser.add_argument('--data_dir', type=str, default='/home/ubuntu/Cabin/Data/',
                    help='The directory containing the image data.')

parser.add_argument('--output_dir', type=str, default='/home/ubuntu/Cabin/ModelOutput/',
                    help='Path to the directory to generate the inference results')

parser.add_argument('--infer_data_list', type=str, default='/home/ubuntu/Cabin/images_list.txt',
                    help='Path to the file listing the inferring images.')

parser.add_argument('--model_dir', type=str, default='/home/ubuntu/Cabin/DeepLab/model/',
                    help="Base directory for the model. "
                         "Make sure 'model_checkpoint_path' given in 'checkpoint' file matches "
                         "with checkpoint name.")

parser.add_argument('--base_architecture', type=str, default='resnet_v2_101',
                    choices=['resnet_v2_50', 'resnet_v2_101'],
                    help='The architecture of base Resnet building block.')

parser.add_argument('--output_stride', type=int, default=16,
                    choices=[8, 16],
                    help='Output stride for DeepLab v3. Currently 8 or 16 is supported.')

parser.add_argument('--debug', action='store_true',
                    help='Whether to use debugger to track down bad values during training.')

parser.add_argument('--private_ip', type=str, default='localhost',
                    help='The IP you want to run your server on.')

parser.add_argument('--port', type=int, default=8080,
                    help='The Port you want to run your server on.')

_NUM_CLASSES = 21

FLAGS, unparsed = parser.parse_known_args()

# This part sets all the needed directories
current_path = os.getcwd()
data_path = current_path + "/Data/"
output_path = current_path + "/Output/"
model_path = current_path + "/DeepLab/model/"
inference_path = current_path + "/DeepLab/tensorflow_deeplab_v3_plus/inference.py"
image_list_dir = current_path + "/images_list.txt"
model_output_path = current_path + "/ModelOutput/"
measure_path = current_path + "/Measure.py"

# Using the Winograd non-fused algorithms provides a small performance boost.
os.environ['TF_ENABLE_WINOGRAD_NONFUSED'] = '1'

pred_hooks = None
if FLAGS.debug:
    debug_hook = tf_debug.LocalCLIDebugHook()
    pred_hooks = [debug_hook]

print("Searching for gpus...")
start = time.time()
gpus = tf.config.experimental.list_physical_devices('GPU')
end = time.time()
print("Found all gpus. ("+ str(end-start) + ")")

print("Generating model...")
start = time.time()
model = tf.estimator.Estimator(
    model_fn=deeplab_model.deeplabv3_plus_model_fn,
    model_dir=FLAGS.model_dir,
    params={
      'output_stride': FLAGS.output_stride,
      'batch_size': 1,  # Batch size must be 1 because the images' size may differ
      'base_architecture': FLAGS.base_architecture,
      'pre_trained_model': None,
      'batch_norm_decay': None,
      'num_classes': _NUM_CLASSES,
    })
end = time.time()
print("Model ready. ("+ str(end-start) + ")")

#print("Generating tensorflow session...")
#start = time.time()
#config = tf.ConfigProto()
#sess = tf.Session(config=config)
#end = time.time()
#print("Session created. ("+ str(end-start) + ")")

def evaluate_model(image_list_dir, inference_path, data_path, model_path, model_output_path):
    print("Preparing list...")
    start = time.time()
    # This part looks at the Data folder and writes the names of all files in it to the list file (image_list_dir)
    imageList = open(image_list_dir, "w")
    for file in os.listdir(data_path):
        imageList.write(str(file)+"\n")
    imageList.close()
    end = time.time()
    print("List generated ("+ str(end-start) + ")")

    print("Loading images...")
    start = time.time()
    # This part runs the model for the current data
    examples = dataset_util.read_examples_list(FLAGS.infer_data_list)
    image_files = [os.path.join(FLAGS.data_dir, filename) for filename in examples]
    end = time.time()
    print("Images loaded ("+ str(end-start) + ")")

    with tf.device("/job:localhost/replica:0/task:0/device:GPU:0"):
        print("Inside device")
        print("Predicting...")
        start = time.time()
        predictions = model.predict(
            input_fn=lambda: preprocessing.eval_input_fn(image_files),
            hooks=pred_hooks)
        end = time.time()
        print("Predictions completed. ("+ str(end-start) + ")")

        output_dir = FLAGS.output_dir
        if not os.path.exists(output_dir):
            os.makedirs(output_dir)

        print("Calling zip function...")
        start = time.time()
        zipped = zip(predictions, image_files)
        end = time.time()
        print("Zip() complete. (" + str(end-start) + ")")

        print("Zipped: " + str(zipped))

        print("Writing output masks...")
        predictionTimeStart = time.time()

        for pred_dict, image_path in zipped:
    #        print("pred_dict is: " + str(pred_dict))

            print("Preparing paths...")
            start = time.time()
            image_basename = os.path.splitext(os.path.basename(image_path))[0]
            output_filename = image_basename + '_mask.png'
            path_to_output = os.path.join(output_dir, output_filename)
            end = time.time()
            print("Paths ready. (" + str(end-start) + ")")

            print("generating:", path_to_output)
            start = time.time()
            mask = pred_dict['decoded_labels']
            end = time.time()
            print("Generated. ("+ str(end-start) + ")")

            # Use this part to also save mask
    #        tmp = Image.fromarray(mask)
    #        plt.axis('off')
    #        plt.imshow(tmp)
    #        plt.savefig(path_to_output, bbox_inches='tight')

            predictionTimeEnd = time.time()
            print("Prediction took: " + str(predictionTimeEnd - predictionTimeStart))

            print("Cropping " + path_to_output)
            start = time.time()
            Cropper.evaluate(path_to_output, cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY))
            end = time.time()
            print("Cropped and wrote to file. ("+ str(end-start) + ")")

            predictionTimeStart = time.time()

        print("Collecting trashes...")
        start = time.time()
        for file in glob.glob(data_path + "*"):
            os.remove(file)
        end = time.time()
        print("All clear! ("+ str(end-start) + ")")

@route('/')#@post('/')
def measure():
    print("Request arrived.")
    try:
        # parse input data
#        try:
#            data = request.json()
#        except:
#            raise ValueError
#
#        if data is None:
#            raise ValueError

        # extract and validate name
        try:
            id = "test"#data['id']
            front_image_url = "https://static1.squarespace.com/static/55b4a361e4b085d388b66c34/t/59709c1903596e8ea44b089e/1501482492586/"#data['front_image_url']
            side_image_url = "https://static1.squarespace.com/static/55b4a361e4b085d388b66c34/t/59709c1903596e8ea44b089e/1501482492586/"#data['side_image_url']
            height = 173#data['height']
            angle = 0#data['angle']
        except (TypeError, KeyError):
            raise ValueError

    except KeyError:
        # if name already exists, return 409 Conflict
        response.status = 409
        return

    try:
        print("Downloading images...")
        start = time.time()
        downloaded_front_image = requests.get(front_image_url)
        downloaded_side_image = requests.get(side_image_url)
        end = time.time()
        print("Download complete. ("+ str(end-start) + ")")
    except(FileNotFoundError, PermissionError, TimeoutError):
        raise ValueError

    print("Preparing images...")
    start = time.time()
    front_image = Image.open(BytesIO(downloaded_front_image.content))
    side_image = Image.open(BytesIO(downloaded_side_image.content))
    end = time.time()
    print("Images ready. ("+ str(end-start) + ")")

    print("Saving images...")
    start = time.time()
    front_image_name = data_path + str(id) + '_front.jpg'
    side_image_name = data_path + str(id) + '_side.jpg'

    front_image.save(front_image_name)
    side_image.save(side_image_name)
    end = time.time()
    print("Images saved. ("+ str(end-start) + ")")

    print("Evaluating model...")
    modelstart = time.time()
    evaluate_model(image_list_dir, inference_path, data_path, model_path, model_output_path)
    modelend = time.time()
    print("Evaluation complete. ("+ str(modelend-modelstart) + ")")

    print("Measuring...")
    start = time.time()
    Measure.evaluate(model_output_path + str(id) + "_front_mask_cropped.png", model_output_path + str(id) + "_side_mask_cropped.png", height, angle, id)
    end = time.time()
    print("Measuring complete. (" + str(end-start) + ")")

    pass

run(host=FLAGS.private_ip, port=FLAGS.port)
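
One workaround sometimes used for this pattern is to keep a single predict generator alive for the lifetime of the server and feed it new inputs through a queue, so that the setup happens once instead of once per request. The sketch below shows only the control flow, in plain Python. `PersistentPredictor` and `fake_predict` are hypothetical names, not part of TensorFlow or of this repository. With a real Estimator the queue-backed generator would have to be wired into the `input_fn` (for example through `tf.data.Dataset.from_generator`), and that part is an untested assumption here.

```python
import queue
import threading


class PersistentPredictor:
    """Keeps one long-lived predict generator alive and feeds it via a queue.

    Hypothetical helper: it assumes predict_fn yields exactly one result per
    input, in order. It is not safe for concurrent callers, and a failure
    inside predict_fn would leave predict() blocked.
    """

    def __init__(self, predict_fn):
        self._inputs = queue.Queue()
        self._outputs = queue.Queue()
        self._worker = threading.Thread(
            target=self._run, args=(predict_fn,), daemon=True)
        self._worker.start()

    def _input_stream(self):
        # Blocks until the next request arrives; None is the shutdown signal.
        while True:
            item = self._inputs.get()
            if item is None:
                return
            yield item

    def _run(self, predict_fn):
        # predict_fn is called once; its setup cost is paid once.
        for result in predict_fn(self._input_stream()):
            self._outputs.put(result)

    def predict(self, item):
        self._inputs.put(item)
        return self._outputs.get()

    def close(self):
        self._inputs.put(None)
        self._worker.join()


setup_calls = []


def fake_predict(stream):
    """Hypothetical stand-in for model.predict; records each setup."""
    setup_calls.append(1)
    for name in stream:
        yield name.upper()


predictor = PersistentPredictor(fake_predict)
results = [predictor.predict(p) for p in ["front.jpg", "side.jpg", "front2.jpg"]]
predictor.close()
print(results, "setups:", len(setup_calls))
```

In the server above, the predictor would be created once at start-up, next to `model`, and each request would call `predictor.predict(...)` instead of `model.predict(...)`. Any prefetching or batching in the input pipeline could break the one-in, one-out assumption, so that would need checking against the actual `eval_input_fn`.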

DrSleep / tensorflow-deeplab-resnet #208