Closed: chenyuZha closed this issue 7 years ago.
This question is better asked on StackOverflow since it is not a bug or feature request. There is also a larger community that reads questions there. Thanks!
@chenyuZha I had a similar issue. The reason was that I tried to run multi-GPU training while the batch size in my config file was still set to 1. If you run with --num_clones=2, the batch size in your config file must be at least 2 and a multiple of 2.
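As a minimal sketch, only the batch_size line in the train_config section of your pipeline config needs to change; every other field stays as you already have it:

    train_config {
      # With --num_clones=2, batch_size must be a multiple of 2
      # so each GPU clone receives at least one example per step.
      batch_size: 2
    }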
darraghdog is correct regarding batch size.
Also, if you are still having issues, try adding --ps_tasks=1 to your list of arguments for train.py (putting it right after the --num_clones argument should work). This works for me when I run ssd_inception_v2_coco on TensorFlow 1.6 with Python 2.7 on Ubuntu 16.04. I haven't tried the particular model you are using.
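A rough sketch of the full invocation with the standard train.py flags (the config and train-dir paths below are placeholders, not taken from this thread):

    # Point the two paths at your own pipeline config and training directory.
    python object_detection/train.py \
        --logtostderr \
        --pipeline_config_path=path/to/pipeline.config \
        --train_dir=path/to/train_dir \
        --num_clones=2 \
        --ps_tasks=1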
When I use the model fasterRcnn_inception_resnet_v2 with my own data for training, I set --num_clones=2 to use my 2 GPUs, but I get the error below:

    File "/home/zha/Documents/models-master/object_detection/trainer.py", line 117, in _create_losses
        ) = _get_inputs(input_queue, detection_model.num_classes)
    ValueError: need more than 0 values to unpack
When I tested with an SSD model instead, everything was fine. The Python version is 2.7 and the system is Ubuntu 16.04. Could anyone tell me why I get this error? (I searched on Stack Overflow but got no response.) Thanks a lot!