I am currently trying to run this program on my custom-built computer, which has a GeForce GTX 1070 alongside an i7. However, whenever I run the program, I don't see any change in GPU processing. This is indicated by a simple nvidia-smi, which constantly gives me:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.24.02              Driver Version: 396.24.02                 |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
| 40%   51C    P0    38W / 151W |   1649MiB /  8118MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1319      G   /usr/lib/xorg/Xorg                           989MiB |
|    0      3617      G   compiz                                       414MiB |
|    0      3799      G   ...are/jetbrains-toolbox/jetbrains-toolbox     3MiB |
|    0      4318      G   ...-token=D21D68637A4C960C2EA136F424DD9CBC   240MiB |
+-----------------------------------------------------------------------------+
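To rule out a broken CUDA install, I also know TensorFlow can be asked directly whether it registers the GPU at all; something like this (a minimal TF1-style sketch, since that is what the project targets) should list the GTX 1070:

import tensorflow as tf
from tensorflow.python.client import device_lib

# A working CUDA setup should list a device like "/device:GPU:0"
# next to the CPU; an empty GPU list points to a driver/CUDA problem.
print(device_lib.list_local_devices())

# Returns the GPU device name, or '' if TensorFlow sees no GPU.
print(tf.test.gpu_device_name())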
My configuration seems to indicate that the model is split between the CPU and the GPU, so I am not too sure what is happening here. My configuration is below:
### Inference Config
VIDEO_INPUT: /home/dcs_user/barberry.mp4 # input must be OpenCV-readable
ROS_INPUT: /camera/color/image_raw # ROS Image Topic
VISUALIZE: True # Disable for performance increase
VIS_FPS: True # Draw current FPS in the top left Image corner
CPU_ONLY: False # CPU Placement for speed test
USE_OPTIMIZED: False # whether to use the optimized model (only possible if transformed with the script)
DISCO_MODE: False # Secret Disco Visualization Mode
DOWNLOAD_MODEL: False # Only for Models available at the TF model_zoo
### Testing
IMAGE_PATH: 'test_images' # path for test_*.py test_images
LIMIT_IMAGES: None # if set to None, all images are used
WRITE_TIMELINE: False # write json timeline file (slows inference)
SAVE_RESULT: False # save detection results to disk
RESULT_PATH: 'test_results' # path to save detection results
SEQ_MODELS: [] # List of Models to sequentially test (Default all Models)
### Object_Detection
WIDTH: 600 # OpenCV Video stream width
HEIGHT: 600 # OpenCV Video stream height
MAX_FRAMES: 5000 # only used if visualize==False
FPS_INTERVAL: 5 # Interval [s] to print fps of the last interval in console
PRINT_INTERVAL: 500 # interval [frames] to print detections to console
PRINT_TH: 0.5 # detection threshold for the print interval
## speed hack
SPLIT_MODEL: True # Splits Model into a GPU and CPU session (currently only works for ssd_mobilenets)
MULTI_THREADING: True # Additional Split Model Speed up through multi threading
SSD_SHAPE: 300 # used for the split model algorithm (currently only supports ssd networks trained on 300x300 and 600x600 input)
SPLIT_NODES: ['Postprocessor/convert_scores','Postprocessor/ExpandDims_1']
# hardcoded split points for ssd_mobilenet_v1
## Tracking
USE_TRACKER: False # Use a Tracker (currently only works properly WITHOUT split_model)
TRACKER_FRAMES: 20 # Number of tracked frames between detections
NUM_TRACKERS: 5 # Max number of objects to track
## Model
OD_MODEL_NAME: 'ssd_mobilenet_v11_coco'
OD_MODEL_PATH: 'models/ssd_mobilenet_v11_coco/{}'
LABEL_PATH: 'rod/data/tf_coco_label_map.pbtxt'
NUM_CLASSES: 90
### DeepLab
ALPHA: 0.3 # mask overlay factor (also for mask_rcnn)
BBOX: True # compute bounding boxes in postprocessing
MINAREA: 500 # minimum pixel area to apply bounding boxes (avoids noise)
## Model
DL_MODEL_NAME: 'deeplabv3_mnv2_pascal_train_aug_2018_01_29'
DL_MODEL_PATH: 'models/deeplabv3_mnv2_pascal_train_aug/{}'
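As I understand the SPLIT_MODEL speed hack, the graph is cut at the two SPLIT_NODES ops so that the convolutional part runs in a GPU session while the postprocessing runs in a CPU session. To see where ops actually land, TF1 device placement logging can help; here is a toy sketch of the idea (my own illustration with made-up layer names, not the project's actual split code):

import numpy as np
import tensorflow as tf

# Toy version of the split idea: pin the heavy part to the GPU
# and the light postprocessing to the CPU.
with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, [1, 300, 300, 3], name='image')
    features = tf.layers.conv2d(x, 32, 3, name='feature_extractor')
with tf.device('/cpu:0'):
    scores = tf.reduce_max(features, axis=[1, 2], name='postprocessor')

# log_device_placement prints the device each op runs on; if everything
# reports CPU, TensorFlow is not using the GPU at all.
config = tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(scores, {x: np.zeros((1, 300, 300, 3), np.float32)})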