dBeker / Faster-RCNN-TensorFlow-Python3

Tensorflow Faster R-CNN for Windows/Linux and Python 3 (3.5/3.6/3.7)
MIT License
612 stars 329 forks

Can this project use gpu? How? #107

Closed. jaron771 closed this issue 5 years ago

morpheusthewhite commented 5 years ago

Install tensorflow-gpu instead of tensorflow. You'll probably also need CUDA and cuDNN if you haven't installed them yet.

https://www.tensorflow.org/install/gpu

jaron771 commented 5 years ago

Should I uninstall tensorflow before installing tensorflow-gpu?

morpheusthewhite commented 5 years ago

Yes
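A minimal sketch of that order (uninstall first, then install), assuming pip manages the environment. `swap_commands` is a hypothetical helper, not part of this repository; it only builds the two commands and does not run them:

```python
import sys


def swap_commands(python=sys.executable):
    """Return the two pip invocations, in order, without executing them.

    Hypothetical helper for illustration: the CPU-only package is removed
    first so that it cannot shadow the GPU build.
    """
    return [
        [python, "-m", "pip", "uninstall", "-y", "tensorflow"],
        [python, "-m", "pip", "install", "tensorflow-gpu"],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run.
    for cmd in swap_commands():
        print(" ".join(cmd))
```

To actually run them, each list can be passed to `subprocess.check_call`.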

jaron771 commented 5 years ago

Thanks. tensorflow-gpu has been installed. Do I need to edit the source code?

morpheusthewhite commented 5 years ago

Nope, it is good as it is

jaron771 commented 5 years ago

This is my console output while running train.py:

2019-09-20 21:59:16.562573: W tensorflow/core/common_runtime/colocation_graph.cc:1016] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [ /job:localhost/replica:0/task:0/device:CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_nameindex=-1 requested_devicename='/device:GPU:0' assigned_devicename='' resource_devicename='/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[]
Assign: CPU
Identity: CPU XLA_CPU XLA_GPU
VariableV2: CPU
Mul: CPU XLA_CPU XLA_GPU
Add: CPU XLA_CPU XLA_GPU
Sub: CPU XLA_CPU XLA_GPU
RandomUniform: CPU XLA_CPU XLA_GPU
Const: CPU XLA_CPU XLA_GPU

Colocation members, user-requested devices, and framework assigned devices, if any:
Fix_VGG16/fc7_conv/Initializer/random_uniform/shape (Const)
Fix_VGG16/fc7_conv/Initializer/random_uniform/min (Const)
Fix_VGG16/fc7_conv/Initializer/random_uniform/max (Const)
Fix_VGG16/fc7_conv/Initializer/random_uniform/RandomUniform (RandomUniform)
Fix_VGG16/fc7_conv/Initializer/random_uniform/sub (Sub)
Fix_VGG16/fc7_conv/Initializer/random_uniform/mul (Mul)
Fix_VGG16/fc7_conv/Initializer/random_uniform (Add)
Fix_VGG16/fc7_conv (VariableV2) /device:GPU:0
Fix_VGG16/fc7_conv/Assign (Assign) /device:GPU:0
Fix_VGG16/fc7_conv/read (Identity) /device:GPU:0
Fix_VGG16/save/Assign_2 (Assign) /device:GPU:0

Is the GPU actually being used? Each iteration now takes about 6.4 s on my 1080 Ti, versus 7.5 s on the CPU, so it only saves about 1 s per iteration... o_o

morpheusthewhite commented 5 years ago

It seems that most operations are still being delegated to the CPU, which is strange.

It seems to be caused by a mismatch between the CUDA and tensorflow versions.
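One way to check whether TensorFlow sees the GPU at all is to list the devices it reports. This is only a sketch: `visible_devices` and `has_gpu` are hypothetical helper names, and the code assumes that `device_lib.list_local_devices()` is available in the installed TensorFlow build.

```python
def visible_devices():
    """Return (name, device_type) pairs for the devices TensorFlow reports,
    or None if TensorFlow cannot be imported."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [(d.name, d.device_type) for d in device_lib.list_local_devices()]


def has_gpu(devices):
    """True if any listed device has type "GPU" (XLA_GPU entries do not count)."""
    return any(kind == "GPU" for _, kind in devices or [])


if __name__ == "__main__":
    devices = visible_devices()
    print("devices:", devices)
    print("GPU visible:", has_gpu(devices))
```

If only CPU (or XLA) devices are listed, the GPU build is not finding a usable CUDA installation, which would be consistent with a version mismatch.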

morpheusthewhite commented 5 years ago

Take a look here: https://stackoverflow.com/questions/50622525/which-tensorflow-and-cuda-version-combinations-are-compatible
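As a rough illustration of that kind of check, the sketch below compares an installed CUDA version against the one a given tensorflow-gpu release was built for. The three table entries are the pairings I recall from TensorFlow's tested build configurations, so treat them as assumptions and verify them against the official table. `TESTED_CUDA` and `cuda_matches` are hypothetical names.

```python
# tensorflow-gpu "major.minor" -> CUDA version it was built against.
# Assumed values; confirm against the official tested build configurations.
TESTED_CUDA = {
    "1.12": "9",
    "1.13": "10.0",
    "1.14": "10.0",
}


def cuda_matches(tf_version, installed_cuda):
    """Return True/False for a known release, or None if it is not in the table."""
    key = ".".join(tf_version.split(".")[:2])
    expected = TESTED_CUDA.get(key)
    if expected is None:
        return None
    expected_parts = expected.split(".")
    # Compare only as many components as the table specifies ("9" matches "9.0").
    return installed_cuda.split(".")[:len(expected_parts)] == expected_parts


if __name__ == "__main__":
    print(cuda_matches("1.14.0", "10.1"))
```

The installed CUDA version has to be read separately, for example from `nvcc --version`.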