Open blockhunts opened 6 years ago
I have the same error. Did you find out how to solve it?
yes, edit this in your config file in ...\models\research\object_detection\training
train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
Thank you. It works here :)
If you download the model from the GitHub repository, the files are up to date.
I ran into this same error while using the AWS DL AMI (Deep Learning AMI (Ubuntu) Version 10.0 (ami-23c4fb46)) and following, as far as I can tell, the same steps I used on Windows with obvious substitutions since this AMI is Ubuntu. Both Ubuntu and Windows are using TF 1.8. But when I use the train_config that blockhunts mentioned I get:
Traceback (most recent call last):
File "/ml/models/research/object_detection/train.py", line 184, in
Any ideas?
I see that epratheeban has the solution to my problem mentioned here https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10/issues/11:
It's easy. Go to the utils folder and find the learning_schedules.py file. Go to line 167 and replace it with the line below:
rate_index = tf.reduce_max(tf.where(tf.greater_equal(global_step, boundaries), list(range(num_boundaries)), [0] * num_boundaries))
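For anyone wondering what that line computes, here is a plain-Python sketch of the same logic with no TensorFlow involved. It assumes that `boundaries` is a list of step thresholds with a leading 0 and that the result is used as an index into the list of learning rates; `rate_index` below is a hypothetical helper written for illustration, not a function from learning_schedules.py. My understanding is that the `list(...)` wrapper matters because on Python 3 `range(...)` returns a lazy range object rather than a list.

```python
def rate_index(global_step, boundaries):
    """Plain-Python mirror of reduce_max(where(step >= boundaries, indices, zeros))."""
    num_boundaries = len(boundaries)
    indices = list(range(num_boundaries))  # explicit list, as in the fix above
    zeros = [0] * num_boundaries
    # Element-wise "where": keep the index of every boundary already reached,
    # and 0 for every boundary that has not been reached yet.
    selected = [i if global_step >= b else z
                for b, i, z in zip(boundaries, indices, zeros)]
    # The largest surviving index is the schedule entry currently in effect.
    return max(selected)


# Assumed boundaries: a leading 0 plus the two schedule steps from the config.
boundaries = [0, 900000, 1200000]
for step in (10, 900000, 2000000):
    print(step, rate_index(step, boundaries))
```

With these assumed boundaries the index moves from 0 to 1 to 2 as training passes each schedule step.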
Hi @jim-meyer, I made this change and that problem was solved, but now I get this error:
WARNING:tensorflow:From C:\Users\sadegh\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\core\losses.py:317: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Future major versions of TensorFlow will allow gradients to flow into the labels input on backprop by default.
See @{tf.nn.softmax_cross_entropy_with_logits_v2}.
Traceback (most recent call last):
File "train.py", line 184, in
TypeError: Cannot convert a list containing a tensor of dtype <dtype: 'int32'> to <dtype: 'float32'> (Tensor is: <tf.Tensor 'Preprocessor/stack_1:0' shape=(1, 3) dtype=int32>)
@tamizharasank which file? For this kind of error, copy it into Google and you will find the fix easily.
@tamizharasank did you solve this error? I got the same error, any suggestions?
After making changes in the config file in the training folder I got this error:
(tensorflow1) C:\tensorflow1\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\platform\app.py:125: main (from main) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\legacy\trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:depth of additional conv before box predictor: 0
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\predictors\heads\box_head.py:93: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\core\losses.py:345: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Future major versions of TensorFlow will allow gradients to flow into the labels input on backprop by default.
See @{tf.nn.softmax_cross_entropy_with_logits_v2}.
C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\ops\gradients_impl.py:108: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\meta_architectures\faster_rcnn_meta_arch.py:2236: get_or_create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.get_or_create_global_step
Traceback (most recent call last):
File "train.py", line 184, in
Looks like you probably did not follow all of the steps in 2a, "Download TensorFlow Object Detection API repository from GitHub" and/or 2b, "Download the Faster-RCNN-Inception-V2-COCO model from TensorFlow's model zoo". Try following those steps again exactly and that should fix your problem.
File "C:\tensorflow1\models\research\object_detection\utils\learning_schedules.py", line 160, in manual_stepping
raise ValueError('First step cannot be zero.')
ValueError: First step cannot be zero.
I edited the file and saved it, but when I train again it returns to its original value.
I'm getting the error below when I try to run: python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_inception_v2_coco.config
WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:
WARNING:tensorflow:From C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\platform\app.py:125: main (from main) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
WARNING:tensorflow:From C:\Tensorflow\models\research\object_detection\legacy\trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
WARNING:tensorflow:From C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Traceback (most recent call last):
File "train.py", line 184, in
@ShubhranshuMaurya that error seems to indicate that there is something wrong with C:\Tensorflow\workspace\training_demo\annotations/label_map.pbtxt. Have you opened that file in a text editor to see if it looks right? That file should look something like this:
item {
  name: 'Class1'
  id: 1
  display_name: 'Class1 Label Name'
}
item {
  name: 'Class2'
  id: 2
  display_name: 'Class2 Label Name'
}
IIRC this file could also be a binary protobuf file in which case viewing it in a text editor won't tell you much. But if it appears to be binary perhaps you could try creating a text version with your training labels and see if that works.
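If you do end up recreating the text version, a small script can generate it from your class names. This is only a sketch: `label_map_text` is a hypothetical helper, it assumes ids start at 1 in list order, and it reuses the class name as the display name. The output path in the final comment is also just an example.

```python
def label_map_text(class_names):
    """Build text-format label map entries, one item per class (ids start at 1)."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append(
            "item {{\n"
            "  name: '{name}'\n"
            "  id: {class_id}\n"
            "  display_name: '{name}'\n"
            "}}\n".format(name=name, class_id=class_id)
        )
    return "\n".join(items)


text = label_map_text(["Class1", "Class2"])
print(text)

# To save it (example path, adjust to your own setup):
# with open("label_map.pbtxt", "w") as f:
#     f.write(text)
```

Comparing the printed text with your existing file is a quick way to spot a missing brace or a duplicated id.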
ERROR: raise ValueError('First step cannot be zero.') ValueError: First step cannot be zero.
SOLUTION: edit the .config file in object_detection\training\ so that it contains:
train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
For me it worked with 'step: 1'; for some reason there was 'step: 0'...
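To see why 'step: 0' trips the error while 'step: 1' passes, the check can be mimicked in a few lines of plain Python. This is a sketch under assumptions: it supposes the schedule steps are collected into a list in config order, `check_schedule_steps` is a hypothetical name, and the increasing-order rule is my own addition rather than something shown in the traceback.

```python
def check_schedule_steps(steps):
    """Reject schedules whose first step is 0 (mirrors the error message above)."""
    if not steps:
        return
    if steps[0] == 0:
        raise ValueError('First step cannot be zero.')
    # Assumed extra rule, not taken from the traceback: steps must increase.
    for previous, current in zip(steps, steps[1:]):
        if current <= previous:
            raise ValueError('Steps must be strictly increasing (assumed rule).')


check_schedule_steps([900000, 1200000])      # passes silently
check_schedule_steps([1, 900000, 1200000])   # 'step: 1' also passes

try:
    check_schedule_steps([0, 900000, 1200000])
except ValueError as err:
    print(err)
```

Under this reading, a schedule entry at step 0 is redundant anyway, because initial_learning_rate already covers the start of training.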
TypeError: Cannot convert a list containing a tensor of dtype <dtype: 'int32'> to <dtype: 'float32'> (Tensor is: <tf.Tensor 'Preprocessor/stack_1:0' shape=(1, 3) dtype=int32>)
Did you find a solution?
yes, edit this in your config file in ...\models\research\object_detection\training
train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
Can you explain what is happening with the learning rate? What do the two step values signify in the manual step learning rate, and what is the initial learning rate?
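As far as I understand the config, training starts at initial_learning_rate, and each schedule block means "from this global step onward, use this learning_rate". So the rate is 0.0002 until step 900000, then 0.00002 until step 1200000, then 0.000002 for the rest of training. Here is a plain-Python sketch of that reading using the values from the config; `learning_rate_at` is a hypothetical helper for illustration, not the API's own implementation.

```python
import bisect

INITIAL_LEARNING_RATE = 0.0002
# (step, learning_rate) pairs copied from the schedule blocks in the config.
SCHEDULE = [(900000, 0.00002), (1200000, 0.000002)]


def learning_rate_at(global_step):
    """Return the learning rate in effect at a given global step."""
    steps = [step for step, _ in SCHEDULE]
    rates = [INITIAL_LEARNING_RATE] + [rate for _, rate in SCHEDULE]
    # bisect_right counts how many schedule steps have been reached
    # (a step equal to the boundary already uses the new rate).
    return rates[bisect.bisect_right(steps, global_step)]


for step in (0, 899999, 900000, 1500000):
    print(step, learning_rate_at(step))
```

Lowering the rate in stages like this lets training take large steps early on and smaller, more careful steps later.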
python train.py --logtostderr -train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v2_quantized_300x300_coco.config
Current thread 0x00005734 (most recent call first):
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\tensorflow_core\python\lib\io\file_io.py", line 84 in _preread_check
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\tensorflow_core\python\lib\io\file_io.py", line 122 in read
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\utils\label_map_util.py", line 168 in load_labelmap
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\utils\label_map_util.py", line 201 in get_label_map_dict
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\data_decoders\tf_example_decoder.py", line 93 in init
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\data_decoders\tf_example_decoder.py", line 460 in init
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\builders\decoder_builder.py", line 63 in build
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\builders\dataset_builder.py", line 209 in build
File "train.py", line 123 in get_next
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\legacy\trainer.py", line 58 in create_input_queue
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\legacy\trainer.py", line 279 in train
File "train.py", line 182 in main
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 324 in new_func
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\absl\app.py", line 258 in _run_main
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\absl\app.py", line 312 in run
File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\tensorflow_core\python\platform\app.py", line 40 in run
File "train.py", line 186 in
help
I tried to use the same images (card) provided. I just deleted all the processed files (csv, dll) and followed all the steps. When I ran python train.py I got this error:
Any clue why this happens?