NVIDIA-AI-IOT / nanosam

A distilled Segment Anything (SAM) model capable of running real-time with NVIDIA TensorRT
Apache License 2.0

The inference time #25

Closed · MrL-CV closed this 4 months ago

MrL-CV commented 4 months ago

I noticed that the README mentions extremely fast inference times. However, after installing NanoSAM on my AGX Orin according to the instructions and measuring the inference time with Python's built-in `time` module, I got significantly different results. What could be the problem? (Screenshot attached: Screenshot from 2024-04-30 01-36-26.) My code is:

```python
# SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import numpy as np
import matplotlib.pyplot as plt
import PIL.Image
import argparse
from nanosam.utils.predictor import Predictor

import time  # timer for profiling

if __name__ == "__main__":

    parser = argparse.ArgumentParser()
    parser.add_argument("--image_encoder", type=str, default="data/resnet18_image_encoder.engine")
    parser.add_argument("--mask_decoder", type=str, default="data/mobile_sam_mask_decoder.engine")
    args = parser.parse_args()

    # Instantiate the TensorRT predictor
    init_start = time.time()
    predictor = Predictor(
        args.image_encoder,
        args.mask_decoder
    )
    init_end = time.time()
    print("Init_time:", (init_end - init_start) * 1e3, "ms.")
    time.sleep(1)

    # Read the image
    read_start = time.time()
    image = PIL.Image.open("assets/dogs.jpg")
    read_end = time.time()
    print("read_img_time:", (read_end - read_start) * 1e3, "ms.")
    time.sleep(1)

    # Run the image encoder
    encoder_start = time.time()
    predictor.set_image(image)
    encoder_end = time.time()
    print("encoder_img_time:", (encoder_end - encoder_start) * 1e3, "ms.")
    time.sleep(1)

    # Segment using a bounding box prompt
    bbox = [100, 100, 850, 759]  # x0, y0, x1, y1

    points = np.array([
        [bbox[0], bbox[1]],
        [bbox[2], bbox[3]]
    ])

    # Labels 2 and 3 mark the box's top-left and bottom-right corners
    point_labels = np.array([2, 3])

    # Run the mask decoder
    Sam_start = time.time()
    mask, _, _ = predictor.predict(points, point_labels)
    Sam_end = time.time()
    print("Sam_img_time:", (Sam_end - Sam_start) * 1e3, "ms.")

    mask = (mask[0, 0] > 0).detach().cpu().numpy()

    # Draw results
    plt.imshow(image)
    plt.imshow(mask, alpha=0.5)
    x = [bbox[0], bbox[2], bbox[2], bbox[0], bbox[0]]
    y = [bbox[1], bbox[1], bbox[3], bbox[3], bbox[1]]
    plt.plot(x, y, 'g-')
    plt.show()
    # plt.savefig("data/basic_usage_out.jpg")
```
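
For reference, the script above times each call exactly once, so the measured numbers include one-time costs such as first-call memory allocation and host-to-device transfers. Below is a minimal steady-state benchmarking sketch, assuming the same `Predictor` API as in the script above and following the usual practice of discarding a few warm-up iterations before averaging over repeated calls; the engine paths, test image, prompts, and iteration counts are placeholders.

```python
import time

import numpy as np
import PIL.Image
import torch
from nanosam.utils.predictor import Predictor

# Placeholder engine paths and test image; adjust to your setup
predictor = Predictor(
    "data/resnet18_image_encoder.engine",
    "data/mobile_sam_mask_decoder.engine",
)
image = PIL.Image.open("assets/dogs.jpg")

points = np.array([[100, 100], [850, 759]])
point_labels = np.array([2, 3])  # box-corner labels, as in the script above

# Warm-up: the first few calls pay one-time costs (allocations, transfers)
for _ in range(5):
    predictor.set_image(image)
    predictor.predict(points, point_labels)

# Timed runs: average over many iterations for a steady-state estimate
n = 50
torch.cuda.synchronize()
t0 = time.time()
for _ in range(n):
    predictor.set_image(image)
    predictor.predict(points, point_labels)
torch.cuda.synchronize()  # make sure all queued GPU work has finished
t1 = time.time()
print(f"mean encode+predict latency: {(t1 - t0) / n * 1e3:.2f} ms")
```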