gustavz / realtime_object_detection

Plug and Play Real-Time Object Detection App with Tensorflow and OpenCV
MIT License

High inference time using r1.0 and master #28

Open harsh-agar opened 6 years ago

harsh-agar commented 6 years ago

Hi @GustavZ, the model ran successfully on the Jetson TX2, but inference was quite slow. I tried both the r1.0 branch and the master branch. For four images, the inference times were 18.15, 2.39, 2.62, 2.53 seconds on master, and 22.34, 0.27, 0.17, 0.13 seconds on r1.0. Visualization was switched off. Is there anything I'm missing that makes it this slow?
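For context, the multi-second first frame above is typical of one-time initialization (graph loading, CUDA context creation), so it helps to report warm-up frames separately from steady-state timing. A minimal sketch of that pattern; `fake_infer` is a hypothetical stand-in for the real detector, not code from this repo:

```python
import time

def benchmark(infer, images, warmup=1):
    """Time per-image inference; report warm-up frames separately.

    The first call usually carries one-time costs (graph loading,
    CUDA context creation), so it should not be averaged into the
    steady-state numbers.
    """
    times = []
    for img in images:
        t0 = time.time()
        infer(img)
        times.append(time.time() - t0)
    return times[:warmup], times[warmup:]

# Hypothetical stand-in for a real detector: only the first call is slow.
_state = {"initialized": False}
def fake_infer(img):
    if not _state["initialized"]:
        _state["initialized"] = True
        time.sleep(0.05)  # simulate one-time initialization

warm, steady = benchmark(fake_infer, range(4))
print(len(warm), len(steady))
```

With this split, the 0.13-0.27 s steady-state numbers on r1.0 are the meaningful figure, not the 22 s first frame.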

Thanks

gustavz commented 6 years ago

@harsh-agar

  1. Are you using the current master?
  2. What does your config look like?
  3. Did you change the code?
  4. Which Python / OpenCV / JetPack versions are you using?
harsh-agar commented 6 years ago

@GustavZ

  1. I tried both master and r1.0; the results are shown above.

  2. This is my config.yml for master:

```yaml
### Inference Config
VIDEO_INPUT: 0        # input must be OpenCV readable
VISUALIZE: True       # disable for performance increase
VIS_FPS: True         # draw current FPS in the top left image corner
CPU_ONLY: False       # CPU placement for speed test
USE_OPTIMIZED: False  # whether to use the optimized model (only possible if transformed with script)
DISCO_MODE: False     # Secret Disco Visualization Mode

### Testing
IMAGE_PATH: 'testimages'     # path for test*.py test_images
LIMIT_IMAGES: None           # if set to None, all images are used
WRITE_TIMELINE: True         # write json timeline file (slows inference)
SAVE_RESULT: False           # save detection results to disk
RESULT_PATH: 'test_results'  # path to save detection results
SEQ_MODELS: []               # list of models to sequentially test (default: all models)

### Object_Detection
WIDTH: 600           # OpenCV only supports 4:3 formats, others will be converted
HEIGHT: 600          # 600x600 leads to 640x480
MAX_FRAMES: 5000     # only used if VISUALIZE == False
FPS_INTERVAL: 5      # interval [s] to print fps of the last interval to console
PRINT_INTERVAL: 500  # interval [frames] to print detections to console
PRINT_TH: 0.5        # detection threshold for print interval

### Speed Hack
SPLIT_MODEL: True    # splits model into a GPU and a CPU session (currently only works for ssd_mobilenets)
SSD_SHAPE: 300       # used for the split-model algorithm (currently only supports ssd networks trained on 300x300 and 600x600 input)

### Tracking
USE_TRACKER: False   # use a tracker (currently only works properly WITHOUT split_model)
TRACKER_FRAMES: 20   # number of tracked frames between detections
NUM_TRACKERS: 5      # max number of objects to track

### Model
OD_MODEL_NAME: 'ssd_mobilenet_v11_coco'
OD_MODEL_PATH: 'models/ssd_mobilenet_v11_coco/{}'
LABEL_PATH: 'rod/data/tf_coco_label_map.pbtxt'
NUM_CLASSES: 90

### DeepLab
ALPHA: 0.3       # mask overlay factor (also for mask_rcnn)
BBOX: True       # compute bounding box in postprocessing
MINAREA: 500     # min pixel area to apply bounding boxes (avoids noise)

### Model
DL_MODEL_NAME: 'deeplabv3_mnv2_pascal_train_aug_2018_01_29'
DL_MODEL_PATH: 'models/deeplabv3_mnv2_pascal_train_aug/{}'
```
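One thing worth noting in this config: `WRITE_TIMELINE: True` adds tracing overhead on every session run (its own comment says it slows inference), so it should be disabled when benchmarking. As an aside, flat `KEY: value  # comment` files like this one can be read with a few lines of stdlib code; the sketch below is a toy parser for illustration only (a real project would simply use PyYAML's `yaml.safe_load`):

```python
def parse_simple_yaml(text):
    """Toy parser for flat 'KEY: value  # comment' config lines.

    Sketch only: handles booleans, None, ints, floats and quoted
    strings; does not handle nesting or lists (e.g. SEQ_MODELS: []
    stays a raw string). Use PyYAML for real configs.
    """
    cfg = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        key, val = (s.strip() for s in line.split(":", 1))
        if val in ("True", "False"):
            cfg[key] = (val == "True")
        elif val == "None":
            cfg[key] = None
        else:
            try:
                cfg[key] = int(val)
            except ValueError:
                try:
                    cfg[key] = float(val)
                except ValueError:
                    cfg[key] = val.strip("'\"")
    return cfg

cfg = parse_simple_yaml("""
VISUALIZE: True       # disable for performance increase
WRITE_TIMELINE: True  # write json timeline file (slows inference)
LIMIT_IMAGES: None
SSD_SHAPE: 300
IMAGE_PATH: 'testimages'
""")
print(cfg)
```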

  3. I did not change the code.

  4. Python 2.7 | OpenCV 3.3.1 | JetPack 3.1

Thanks again

harsh-agar commented 6 years ago

I ran the script test_objectdetection.py, and what I observed is that the GPU is used while loading the model, but during detection GPU usage is 0%.
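Since `WRITE_TIMELINE: True` is set, the JSON timeline it produces can confirm where ops actually ran. The sketch below counts timed events per device in a Chrome-trace file; the `device_summary` helper and the synthetic trace are assumptions for illustration (the structure follows the standard Chrome trace event format that TensorFlow timelines use), not code from this repo:

```python
import json

def device_summary(trace_text):
    """Count 'complete' (timed) events per device in a Chrome-trace
    timeline, e.g. one written when WRITE_TIMELINE is enabled.
    Hypothetical helper for illustration.
    """
    events = json.loads(trace_text).get("traceEvents", [])
    # 'M' metadata events with name 'process_name' map pid -> device name
    devices = {ev["pid"]: ev["args"]["name"]
               for ev in events
               if ev.get("ph") == "M" and ev.get("name") == "process_name"}
    counts = {}
    for ev in events:
        if ev.get("ph") == "X":  # 'X' = complete event with a duration
            dev = devices.get(ev.get("pid"), "unknown")
            counts[dev] = counts.get(dev, 0) + 1
    return counts

# Tiny synthetic trace: two ops on the GPU, one on the CPU.
sample = json.dumps({"traceEvents": [
    {"ph": "M", "name": "process_name", "pid": 0,
     "args": {"name": "/device:GPU:0"}},
    {"ph": "M", "name": "process_name", "pid": 1,
     "args": {"name": "/device:CPU:0"}},
    {"ph": "X", "pid": 0, "ts": 0, "dur": 5, "name": "Conv2D"},
    {"ph": "X", "pid": 0, "ts": 5, "dur": 5, "name": "Relu"},
    {"ph": "X", "pid": 1, "ts": 10, "dur": 2, "name": "NonMaxSuppression"},
]})
summary = device_summary(sample)
print(summary)
```

If most timed ops land on `/device:CPU:0` in the real timeline, the graph is not actually executing on the GPU, which would match the 0% GPU usage observed during detection.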