Open cosminnicula opened 5 years ago
What was the issue?
You may try predicting more images at once. I also found that the first image always takes more time, but the other images are OK.
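The warm-up effect described above (first call slow, later calls fast) can be illustrated with a stand-in model. The class and the sleep durations here are hypothetical, just to show the measurement pattern; the real cost on the first call would come from graph construction and CUDA initialization.

```python
import time

class FakeModel:
    """Hypothetical stand-in for a model whose first call triggers
    one-time setup (e.g. graph compilation), mimicking the behavior
    reported above."""
    def __init__(self):
        self._warmed_up = False

    def detect(self, image):
        if not self._warmed_up:
            time.sleep(0.2)   # simulated one-time setup cost
            self._warmed_up = True
        time.sleep(0.01)      # simulated steady-state inference

model = FakeModel()
timings = []
for _ in range(3):
    start = time.perf_counter()
    model.detect("image")
    timings.append(time.perf_counter() - start)

print([round(t, 2) for t in timings])  # first entry is much larger
```

Timing each call separately like this makes it easy to separate the one-time warm-up from the steady-state per-image cost.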
The problem is not in the hardware. The problem is in the code.
If we look at this line of code in the detect function in model.py:
# Mold inputs to format expected by the neural network
molded_images, image_metas, windows = self.mold_inputs(images)
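To see why this step costs time on large images, here is a simplified sketch of what molding involves: pad the image into a fixed square canvas and subtract the mean pixel. This is an illustration only, not the library's exact code; the real mold_inputs also resizes the image and builds image_metas. The MEAN_PIXEL values match Mask R-CNN's default config; mold_image_sketch is a hypothetical name.

```python
import numpy as np

MEAN_PIXEL = np.array([123.7, 116.8, 103.9])  # Mask R-CNN's default

def mold_image_sketch(image, target_size=1024):
    # Pad the image into a target_size x target_size canvas
    # (the real code also rescales; padding alone keeps this short).
    h, w = image.shape[:2]
    padded = np.zeros((target_size, target_size, 3), dtype=np.float32)
    padded[:h, :w] = image
    # The window records where the real pixels live inside the canvas.
    window = (0, 0, h, w)
    return padded - MEAN_PIXEL, window

img = np.ones((400, 600, 3), dtype=np.float32) * 128.0
molded, window = mold_image_sketch(img)
print(molded.shape, window)  # (1024, 1024, 3) (0, 0, 400, 600)
```

Every pixel of a large input is touched during this copy-and-normalize pass, which is why the cost grows with image size.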
And this line:
for i, image in enumerate(images):
final_rois, final_class_ids, final_scores, final_masks =\
self.unmold_detections(detections[i], mrcnn_mask[i],
image.shape, molded_images[i].shape,
windows[i])
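The reverse direction works the same way: detections come back in the molded (padded/resized) coordinate frame and must be shifted by the window and rescaled to the original image. The sketch below shows only the box-coordinate part of what unmold_detections does, with hypothetical names; the real function also resizes the masks, which is the expensive part for large images.

```python
import numpy as np

def unmold_boxes_sketch(boxes, window, original_shape):
    # Shift boxes so the window's top-left corner becomes the origin,
    # then rescale from window size to original image size.
    wy1, wx1, wy2, wx2 = window
    boxes = boxes - np.array([wy1, wx1, wy1, wx1])
    scale_y = original_shape[0] / (wy2 - wy1)
    scale_x = original_shape[1] / (wx2 - wx1)
    return boxes * np.array([scale_y, scale_x, scale_y, scale_x])

window = (0, 0, 512, 512)  # real pixels occupy the top-left 512x512
boxes = np.array([[0.0, 0.0, 256.0, 256.0]])  # (y1, x1, y2, x2)
out = unmold_boxes_sketch(boxes, window, (4000, 1800))
print(out)  # [[   0.    0. 2000.  900.]]
```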
We can see that some operations are executed on the input images. I have done some tests and, for one image of size 4000x1800 (1.88 MB), the mold_inputs and unmold_detections operations together take up to 5 seconds.
I am trying to find a workaround so those operations aren't needed, but I fear I won't find one, because those functions give the network input the shape it needs.
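One partial workaround worth trying (an assumption on my part, not something verified against the repo): since the mold/unmold cost grows with input size, downscale very large images before calling detect(). Plain stride slicing is a crude but nearly free 2x downsample; a proper resize (e.g. cv2.resize) would preserve quality better, at the cost of detecting small objects less reliably.

```python
import numpy as np

def cheap_downscale(image, factor=2):
    # Keep every factor-th row and column; crude but costs almost
    # nothing compared to a full interpolating resize.
    return image[::factor, ::factor]

img = np.zeros((4000, 1800, 3), dtype=np.uint8)  # same size as above
small = cheap_downscale(img, factor=2)
print(small.shape)  # (2000, 900, 3)
```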
When doing inference on my Tesla P4 GPU, the inference time is ~5.3s.
Here is the Python source that I'm using to do the inference:
And the output is:
I changed mrcnn/model.py so that it prints the inference time:
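The actual edit to mrcnn/model.py isn't shown in the issue; the decorator below is a generic stand-in for that kind of instrumentation, wrapping a function so it prints how long each call takes. All names here are illustrative.

```python
import time
from functools import wraps

def timed(fn):
    """Print the wall-clock duration of each call to fn."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{fn.__name__} took {elapsed:.3f}s")
        return result
    return wrapper

@timed
def fake_detect():
    time.sleep(0.05)  # stand-in for the real inference call
    return "detections"

result = fake_detect()
```

Applying such a wrapper separately to the molding, the network forward pass, and the unmolding would show which stage dominates the ~5s reported above.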
Other environment details are below.
I'm using tensorflow-gpu:
NVIDIA drivers and CUDA 9.0 are correctly installed:
Is there something wrong / missing in these steps?