tensorflow / models

Models and examples built with TensorFlow

Tensorflow-gpu utilize only CPU #4282

Closed magick2 closed 6 years ago

magick2 commented 6 years ago

Please go to Stack Overflow for help and support:

http://stackoverflow.com/questions/tagged/tensorflow

Also, please understand that many of the models included in this repository are experimental and research-style code. If you open a GitHub issue, here is our policy:

  1. It must be a bug, a feature request, or a significant problem with documentation (for small docs fixes please send a PR instead).
  2. The form below must be filled out.

Here's why we have that policy: TensorFlow developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.


What is the top-level directory of the model you are using: Object detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes, just to insert a video instead of using the webcam
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 64 bits (Last version)
TensorFlow installed from (source or binary): Binary
TensorFlow version (use command below): v1.8.0-0-g93bc2e2072' 1.8.0
Bazel version (if compiling from source): N/A
CUDA/cuDNN version: CUDA: cuda_9.0.176 cuDNN: cudnn-9.0
GPU model and memory: MSI Geforce GTX 1070 8gb
Exact command to reproduce: N/A

You can collect some of this information using our environment capture script:

https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh

You can obtain the TensorFlow version with

python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"

Describe the problem

I have tensorflow-gpu installed, but it runs the code on the CPU.

Source code / logs

object_detection_tutorial.zip

a819721810 commented 6 years ago

Use a .bat file with contents like this:

set CUDA_VISIBLE_DEVICES=-1
python3 train.py --logtostderr --pipeline_config_path=faster_rcnn_inception_resnet_v2_atrous_coco.config --train_dir=train --num_clones=1

magick2 commented 6 years ago

Thanks for answering. I ran the .bat file but nothing happened. What should happen? I replaced python3 with py, since that is how I run Python commands.

a819721810 commented 6 years ago

[Screenshot]

set CUDA_VISIBLE_DEVICES=-1
set PYTHONPATH=C:\Project\OpenSourceCodes\models\research;C:\Project\OpenSourceCodes\models\research\slim
python C:\Project\OpenSourceCodes\models\research\object_detection\train.py --logtostderr --pipeline_config_path=C:\Project\train_test\faster_rcnn_resnet101_coco.config --train_dir=C:\Project\train_test\train

clear?

qlzh727 commented 6 years ago

If you have one GPU, you should set CUDA_VISIBLE_DEVICES=0, since device IDs start at 0. Setting it to "-1" disables use of the GPU.

CUDA_VISIBLE_DEVICES is a comma-separated list of GPU device IDs (from nvidia-smi). E.g., if you have 4 GPUs and only want to run your job on the first and third, you can set CUDA_VISIBLE_DEVICES=0,2

If you need to do it in Python code, you can do os.environ['CUDA_VISIBLE_DEVICES'] = '0'
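The in-code approach can be sketched as below. This is a minimal illustration, not part of the original reply: the helper name set_visible_gpus is hypothetical, and it assumes CUDA's zero-based device numbering, so a machine with a single GPU exposes it as device 0.

```python
import os


def set_visible_gpus(gpu_ids):
    """Expose only the given zero-based GPU indices to CUDA.

    An empty list hides every GPU by setting the variable to "-1".
    (Hypothetical helper, introduced only for this illustration.)
    """
    value = ",".join(str(i) for i in gpu_ids) if gpu_ids else "-1"
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Set the variable before TensorFlow initializes CUDA; the simplest way
# to guarantee that is to do it before `import tensorflow`.
set_visible_gpus([0])
```

Calling set_visible_gpus([0, 2]) would expose the first and third GPU, and set_visible_gpus([]) would force CPU-only execution.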

magick2 commented 6 years ago

I have tensorflow installed, and if I run the train.py file I do not see any error. But if I run it from the .bat file, this appears: ModuleNotFoundError: No module named 'tensorflow'. Edit: I tried using os.environ['CUDA_VISIBLE_DEVICES'] = '1' inside my code, and that did not solve it either. The strange thing is that if I try to execute the .py file I get the error "ModuleNotFoundError: No module named 'tensorflow'", but if I run it in Sublime Text or Python's IDLE this error does not happen ...
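An import that works in one launcher and fails in another often means the two are starting different Python interpreters. The sketch below is one way to check that, under the assumption that this is what is happening here (the thread does not confirm it). The function name describe_interpreter is hypothetical; it only reports which interpreter is running and whether that interpreter can locate the module, without importing it.

```python
import importlib.util
import sys


def describe_interpreter(module_name="tensorflow"):
    """Report the running interpreter and whether it can find a module.

    (Hypothetical helper, introduced only for this illustration.)
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,        # path of the Python actually running
        "module_found": spec is not None,    # False means an import would fail
        "module_location": getattr(spec, "origin", None),
    }


info = describe_interpreter()
print(info)
```

Running the same script from the .bat file and from the editor, then comparing the "executable" values, would show whether both use the interpreter that has tensorflow-gpu installed.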

qlzh727 commented 6 years ago

I don't think it's a TensorFlow issue or a TensorFlow models issue. The root cause is probably your local environment setup. Due to the limited bandwidth of the TensorFlow team, please open this on Stack Overflow, where someone from the TF community will probably help you debug the issue.

magick2 commented 6 years ago

I even tried Anaconda3 to test whether it was an environment problem, but that did not solve it. I only have tensorflow-gpu installed (from pip). I do not see how it can be an environment problem if the CPU version of TensorFlow is not installed.

a819721810 commented 6 years ago

First, you were supposed to look at my screenshot. Setting CUDA_VISIBLE_DEVICES=-1 disables the GPU, so tensorflow-gpu then uses only the CPU; you should use os.environ['CUDA_VISIBLE_DEVICES'] = '-1'. Second, you said you get ModuleNotFoundError: No module named 'tensorflow'. Apparently you are using Anaconda3, where the .bat file cannot find TensorFlow. So you should stop using Anaconda3; you can use the .bat file or os.environ['CUDA_VISIBLE_DEVICES'] = '-1'. Am I clear?

a819721810 commented 6 years ago

Because Anaconda is its own environment, I used to run into this problem, so I abandoned Anaconda. Now I can control my computer's environment and use the .bat file.

magick2 commented 6 years ago

Thanks for the help, I solved the "ModuleNotFoundError". The problem now is that when I execute the .bat file, the following are reported as unrecognized commands: "--logtostderr", "--pipeline_config_path", "--train_dir" @a819721810
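One possible cause, which the thread does not confirm, is that the flags ended up on separate lines of the .bat file, where each line runs as its own command. A launcher written in Python avoids that, because the arguments are passed as one list. The sketch below is an illustration only: build_train_command and launch are hypothetical names, the paths in the example call are placeholders, and the PYTHONPATH entries from the earlier .bat file are left out and would still need to be set.

```python
import os
import subprocess
import sys


def build_train_command(train_script, config_path, train_dir):
    """Assemble the train.py invocation as a single argument list.

    (Hypothetical helper; the flags are the ones quoted in the thread.)
    """
    return [
        sys.executable,  # reuse the interpreter running this script
        train_script,
        "--logtostderr",
        "--pipeline_config_path=" + config_path,
        "--train_dir=" + train_dir,
    ]


def launch(command, gpu_ids="0"):
    """Run the command with CUDA_VISIBLE_DEVICES set for the child process only."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_ids)
    return subprocess.run(command, env=env).returncode


# Placeholder paths, for illustration only.
cmd = build_train_command("train.py", "pipeline.config", "train")
print(cmd)
```

Calling launch(cmd) would then start training with GPU 0 visible, and launch(cmd, gpu_ids="-1") would force a CPU-only run.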