gustavz / realtime_object_detection

Plug and Play Real-Time Object Detection App with Tensorflow and OpenCV
MIT License

Trying to use the GPU on my computer #31

Open sd12832 opened 6 years ago

sd12832 commented 6 years ago

I am currently trying to run this program on my custom-built computer, which has a GeForce GTX 1070 and an i7. However, whenever I run the program, I don't see any change in GPU usage. This is indicated by a simple nvidia-smi, which constantly gives me:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.24.02              Driver Version: 396.24.02                 |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
| 40%   51C    P0    38W / 151W |   1649MiB /  8118MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1319      G   /usr/lib/xorg/Xorg                           989MiB |
|    0      3617      G   compiz                                       414MiB |
|    0      3799      G   ...are/jetbrains-toolbox/jetbrains-toolbox     3MiB |
|    0      4318      G   ...-token=D21D68637A4C960C2EA136F424DD9CBC   240MiB |

My configuration seems to indicate that I am using the split between the CPU and the GPU, so I am not sure what is happening here. My configuration is below:

### Inference Config
VIDEO_INPUT: /home/dcs_user/barberry.mp4                      # Input Must be OpenCV readable
ROS_INPUT: /camera/color/image_raw  # ROS Image Topic
VISUALIZE: True                     # Disable for performance increase
VIS_FPS: True                       # Draw current FPS in the top left Image corner
CPU_ONLY: False                     # CPU Placement for speed test
USE_OPTIMIZED: False                # whether to use the optimized model (only possible if transform with script)
DISCO_MODE: False                   # Secret Disco Visualization Mode
DOWNLOAD_MODEL: False               # Only for Models available at the TF model_zoo

### Testing
IMAGE_PATH: 'test_images'           # path for test_*.py test_images
LIMIT_IMAGES: None                  # if set to None, all images are used
WRITE_TIMELINE: False                # write json timeline file (slows inference)
SAVE_RESULT: False                  # save detection results to disk
RESULT_PATH: 'test_results'         # path to save detection results
SEQ_MODELS: []                      # List of Models to sequentially test (Default all Models)

### Object_Detection
WIDTH: 600                          # OpenCV Video stream width
HEIGHT: 600                         # OpenCV Video stream height
MAX_FRAMES: 5000                    # only used if visualize==False
FPS_INTERVAL: 5                     # Interval [s] to print fps of the last interval in console
PRINT_INTERVAL: 500                 # interval [frames] to print detections to console
PRINT_TH: 0.5                       # detection threshold for det_intervall
## speed hack
SPLIT_MODEL: True                   # Splits Model into a GPU and CPU session (currently only works for ssd_mobilenets)
MULTI_THREADING: True               # Additional Split Model Speed up through multi threading
SSD_SHAPE: 300                      # used for the split model algorithm (currently only supports ssd networks trained on 300x300 and 600x600 input)
SPLIT_NODES: ['Postprocessor/convert_scores','Postprocessor/ExpandDims_1']
                                    # hardcoded split points for ssd_mobilenet_v1
## Tracking
USE_TRACKER: False                  # Use a Tracker (currently only works properly WITHOUT split_model)
TRACKER_FRAMES: 20                  # Number of tracked frames between detections
NUM_TRACKERS: 5                     # Max number of objects to track
## Model
OD_MODEL_NAME: 'ssd_mobilenet_v11_coco'
OD_MODEL_PATH: 'models/ssd_mobilenet_v11_coco/{}'
LABEL_PATH: 'rod/data/tf_coco_label_map.pbtxt'
NUM_CLASSES: 90

### DeepLab
ALPHA: 0.3                     # mask overlay factor (also for mask_rcnn)
BBOX: True                     # compute boundingbox in postprocessing
MINAREA: 500                   # min Pixel Area to apply bounding boxes (avoid noise)
## Model
DL_MODEL_NAME: 'deeplabv3_mnv2_pascal_train_aug_2018_01_29'
DL_MODEL_PATH: 'models/deeplabv3_mnv2_pascal_train_aug/{}'

gustavz commented 6 years ago

Hey @sd12832, to be honest I don't know the exact specs or programs installed on your PC. Have you been able to run other TensorFlow models on the GPU?

First guess: Did you install TensorFlow with GPU support?
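
One way to check this is sketched below. It is not part of this repo: the helper name `list_tf_gpus` is made up for illustration, and it assumes that `device_lib.list_local_devices()` from TensorFlow is available in your install. If it returns an empty list, TensorFlow cannot see the GTX 1070, which usually points to a CPU-only TensorFlow package or a CUDA/cuDNN setup problem.

```python
def list_tf_gpus():
    """Return the names of GPU devices TensorFlow can see.

    Returns None if TensorFlow cannot be imported at all.
    (Hypothetical helper for illustration, not part of the repo.)
    """
    try:
        # Imported lazily so the check also works when TensorFlow is missing.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    # Each entry describes one device; keep only those of type "GPU".
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]


if __name__ == "__main__":
    gpus = list_tf_gpus()
    if gpus is None:
        print("TensorFlow could not be imported.")
    elif not gpus:
        print("TensorFlow is installed but sees no GPU.")
    else:
        print("TensorFlow sees these GPUs:", gpus)
```

If a GPU is listed here but nvidia-smi still shows no Python process while the app runs, the problem is more likely in the app configuration than in the TensorFlow install.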