tensorflow / models

Models and examples built with TensorFlow

Config difference GPU/TPU #9230

Open turowicz opened 4 years ago

turowicz commented 4 years ago

What is the difference between GPU and TPU configs in the Model Zoo?


I'm worried that all my training runs are going wrong because I'm running locally on a GPU, while the published configs may be usable only with a TPU.

mgon5170 commented 4 years ago

I haven't gotten my code to run fully, so I'm not 100% certain, but I think this is handled outside of the config file. I haven't changed anything in the config file other than the directory paths, and this was the first output I got:

SSD) USER$ python object_detection/model_main_tf2.py --pipeline_config_path=${PIPELINE_CONFIG_PATH} --model_dir=${MODEL_DIR} --alsologtostder
2020-09-10 09:38:48.794458: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-09-10 09:38:48.806430: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fb7c6883500 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-10 09:38:48.806468: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:There are non-GPU devices in tf.distribute.Strategy, not using nccl allreduce.
W0910 09:38:48.807556 4723258816 cross_device_ops.py:1175] There are non-GPU devices in tf.distribute.Strategy, not using nccl allreduce.
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)
I0910 09:38:48.807872 4723258816 mirrored_strategy.py:500] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)

As far as I can tell, the code determines which device it should run on. Granted, I'm not running on a GPU since I'm on a Mac. Hope this helps!
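
The log above ends with MirroredStrategy on a single CPU device, which fits the idea that the device choice comes from how the script is launched rather than from the pipeline config. A minimal sketch of that kind of selection follows, in plain Python so it runs without TensorFlow installed. The function name and the `use_tpu` and `num_workers` parameters are assumptions made for illustration, not the confirmed flags of `model_main_tf2.py`; the returned strings are names of real `tf.distribute` strategy classes.

```python
def pick_strategy_name(use_tpu: bool = False, num_workers: int = 1) -> str:
    """Hypothetical helper: choose a tf.distribute strategy name from launch options.

    Nothing here reads the pipeline config; the choice depends only on
    how the training script was started (an assumption for this sketch).
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if use_tpu:
        # A TPU was explicitly requested at launch time.
        return "TPUStrategy"
    if num_workers > 1:
        # Several machines cooperate on one training job.
        return "MultiWorkerMirroredStrategy"
    # Default: mirror variables across whatever local devices exist,
    # which can be a single CPU, as in the log above.
    return "MirroredStrategy"


if __name__ == "__main__":
    # A plain local run with no special options.
    print(pick_strategy_name())
```

Under this assumption, the same config file would be used unchanged whether the run lands on a CPU, a GPU, or a TPU; only the launch options differ.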

turowicz commented 4 years ago

I get that, but my question was related to individual config values that may have been chosen for TPU use. Otherwise why put TPU in the file name?

mgon5170 commented 4 years ago

I assume it's what they used to generate the model/checkpoints.

turowicz commented 4 years ago

Let me rephrase the question:

"Is it a good idea to take a TPU config and continue training with it on a GPU, or do we need to change something in that file for better results?"
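
A rough sketch of one adjustment that often comes up when a config tuned for a large batch is reused on a single GPU: shrinking the batch size to fit memory and scaling the learning rate by the same factor (the linear scaling heuristic). Nothing in this thread confirms that the Model Zoo configs need this, and the numbers below are made up for illustration rather than read from any published config. `scale_learning_rate` is a hypothetical helper.

```python
def scale_learning_rate(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale a learning rate in proportion to a change in batch size.

    This is the linear scaling heuristic, a rule of thumb and not a
    guarantee of equivalent training results.
    """
    if base_lr <= 0:
        raise ValueError("base_lr must be positive")
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch / base_batch


if __name__ == "__main__":
    # Illustrative values only: a config tuned for batch size 128,
    # reused on a GPU that can only fit batch size 8.
    print(scale_learning_rate(base_lr=0.08, base_batch=128, new_batch=8))
```

Any warmup learning rate in the same config would presumably be scaled the same way, and the result is still worth checking against the loss curve of an actual run.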