tensorflow / models

Models and examples built with TensorFlow

ValueError: Tensor conversion requested dtype string for Tensor with dtype float32: 'Tensor("arg0:0", shape=(), dtype=float32, device=/device:CPU:0)' #4909

Closed manu1a closed 6 years ago

manu1a commented 6 years ago

Please go to Stack Overflow for help and support:

http://stackoverflow.com/questions/tagged/tensorflow

Also, please understand that many of the models included in this repository are experimental and research-style code. If you open a GitHub issue, here is our policy:

  1. It must be a bug, a feature request, or a significant problem with documentation (for small docs fixes please send a PR instead).
  2. The form below must be filled out.

Here's why we have that policy: TensorFlow developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.


System information

You can collect some of this information using our environment capture script:

https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh

You can obtain the TensorFlow version with

python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"

Describe the problem

It is a bug. I am trying to train my model using model_main.py in the object detection folder. I have the TFRecord files, images, and other data set up. I get this error when I run either model_main.py or train.py. Please help me train the model.

Source code / logs

/models/research/object_detection$ python model_main.py --logtostderr --model_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config

WARNING:tensorflow:Estimator's model_fn (<function model_fn at 0x7f1bc431f6e0>) includes params argument, but params are not passed to Estimator.
WARNING:tensorflow:num_readers has been reduced to 0 to match input file shards.
Traceback (most recent call last):
  File "model_main.py", line 101, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "model_main.py", line 97, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 447, in train_and_evaluate
    return executor.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 531, in run
    return self.run_local()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 669, in run_local
    hooks=train_hooks)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 366, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1119, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1129, in _train_model_default
    input_fn, model_fn_lib.ModeKeys.TRAIN))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 985, in _get_features_and_labels_from_input_fn
    result = self._call_input_fn(input_fn, mode)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1074, in _call_input_fn
    return input_fn(**kwargs)
  File "/home/kumar/Data/rec/Tesnorflow_Beginners/models/research/object_detection/inputs.py", line 412, in _train_input_fn
    batch_size=params['batch_size'] if params else train_config.batch_size)
  File "/home/kumar/Data/rec/Tesnorflow_Beginners/models/research/object_detection/builders/dataset_builder.py", line 134, in build
    config.input_path[:], input_reader_config)
  File "/home/kumar/Data/rec/Tesnorflow_Beginners/models/research/object_detection/builders/dataset_builder.py", line 80, in read_dataset
    sloppy=config.shuffle))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1002, in apply
    dataset = transformation_func(self)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/data/python/ops/interleave_ops.py", line 88, in _apply_fn
    buffer_output_elements, prefetch_input_elements)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/readers.py", line 130, in __init__
    cycle_length, block_length)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1988, in __init__
    super(InterleaveDataset, self).__init__(input_dataset, map_func)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1957, in __init__
    self._map_func.add_to_graph(ops.get_default_graph())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 475, in add_to_graph
    self._create_definition_if_needed()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 331, in _create_definition_if_needed
    self._create_definition_if_needed_impl()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 340, in _create_definition_if_needed_impl
    self._capture_by_value, self._caller_device)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 804, in func_graph_from_py_func
    outputs = func(func_graph.inputs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1945, in tf_map_func
    dataset = map_func(nested_args)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/readers.py", line 196, in __init__
    filenames = ops.convert_to_tensor(filenames, dtype=dtypes.string)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1011, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1107, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 944, in _TensorTensorConversionFunction
    (dtype.name, t.dtype.name, str(t)))
ValueError: Tensor conversion requested dtype string for Tensor with dtype float32: 'Tensor("arg0:0", shape=(), dtype=float32, device=/device:CPU:0)'

baiboat commented 6 years ago

Did you fix this problem? I got the same error.

manu1a commented 6 years ago

Not yet

baiboat commented 6 years ago

I fixed my error. It happened because I forgot to put my .record files under object_detection/data/.

muchensthughs commented 6 years ago

Any solution? Same here. For me it only occurs in the pet detector tutorial.

karmel commented 6 years ago

Hi @manu1a -- it is hard to help you debug this without further information. Can you clarify all of the steps you took before running model_main? Where is the data in question, and how was that data created? Thanks.

manu1a commented 6 years ago

Hi @karmel, sorry for my late response. I created the TFRecord files using the script provided in the dataset tools folder of the TensorFlow Object Detection API. The problem is solved now; I fixed it by adjusting the folder structure for the training data, similar to what @baiboat suggested. Thank you.

manu1a commented 6 years ago

Please let me know if the issue can be closed.

jayshah19949596 commented 6 years ago

I placed my .record files in object_detection/data/ but this did not solve the problem.

muchensthughs commented 6 years ago

Matching the number of evaluation samples in the config file with the actual number of evaluation samples fixed my problem. The data is partitioned into 10 shards, while I was only using one shard for training and evaluation.

jayshah19949596 commented 6 years ago

Yes, I figured out that the number in the config file did not match my records file. Thanks :)

mingkin commented 6 years ago

I have the same error! My config file is as follows:

train_input_reader: {
  tf_record_input_reader {
    input_path: "/data/antispam/bmb-python/pynotebook/tf-model/research/object_detection/data/pet_faces_train.record-?????-of-00100"
  }
  label_map_path: "/data/antispam/bmb-python/pynotebook/tf-model/research/object_detection/data/pet_label_map.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/data/antispam/bmb-python/pynotebook/tf-model/research/object_detection/data/pet_faces_val.record-?????-of-00010"
  }
  label_map_path: "/data/antispam/bmb-python/pynotebook/tf-model/research/object_detection/data/pet_faces_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}

Is the config wrong?

codinesh1795 commented 6 years ago

Is the .record file you mentioned the same as a .tfrecord file, or is it different? I ask because after running generate_tfrecords.py I get two files: "train.tfrecord" and "eval.tfrecord".

Manish-rai21bit commented 6 years ago

One of the reasons for this error is that the path to the train or test TFRecord file is not correct. It is not necessary to maintain the exact directory structure as long as the file paths are correct in the pipeline.config file.
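A quick way to test this before launching training is to pull every input_path out of the config and check that it matches at least one file. The following is a minimal sketch, not part of the Object Detection API: the function name find_unmatched_input_paths is hypothetical, the regular expression assumes each path is written as input_path: "..." on one line, and the standard-library glob only sees local files (not gs:// paths).

```python
import glob
import re


def find_unmatched_input_paths(config_text):
    """Return every input_path pattern in config_text that matches no local file.

    Hypothetical helper: it scans the raw text of a pipeline config for
    input_path: "..." entries and expands each one with the standard glob
    module, so shard wildcards such as ????? are honored.
    """
    patterns = re.findall(r'input_path:\s*"([^"]+)"', config_text)
    return [pattern for pattern in patterns if not glob.glob(pattern)]
```

To use it, read the config file into a string (for example with open(...).read()) and print the result; an empty list means every pattern found at least one file, while any pattern that is returned is a likely cause of the error discussed here.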

junweima commented 6 years ago

I solved it by changing the default 'mscoco_train.record-????-of-00100' to 'coco_train.record-????-of-00100', because the file names were different when I generated my TFRecords.

moinkhan3012 commented 6 years ago

I'm confused. What do you mean by 10 shards? Please reply as soon as you can, @muchensthughs.

muchensthughs commented 6 years ago

"I'm confused. What do you mean by 10 shards? Please reply as soon as you can, @muchensthughs."

You can try changing the num_examples value of eval_config in the configuration file to the actual number of evaluation samples you have. The sample config is meant to be used with 10 shards (pet_faces_val.record-00001-of-00010, pet_faces_val.record-00002-of-00010, etc.), but I was only using one shard, so the number of evaluation examples did not match.
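To find the number to put in num_examples, the records in the evaluation shards can be counted directly. This is a sketch under stated assumptions rather than an official tool: count_tfrecord_examples is a hypothetical name, the files are assumed to be uncompressed and local, and the code relies on the standard TFRecord framing (8-byte little-endian length, 4-byte CRC, payload, 4-byte CRC) without verifying the checksums.

```python
import glob
import struct


def count_tfrecord_examples(pattern):
    """Count records in all local TFRecord files matching a glob pattern.

    Assumes uncompressed files in the standard TFRecord framing:
    uint64 length, uint32 CRC of the length, payload, uint32 CRC of the payload.
    CRCs are skipped, not checked, so a corrupt file is not detected.
    """
    total = 0
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as f:
            while True:
                header = f.read(8)
                if len(header) < 8:
                    break  # end of file, or a truncated trailing header
                (length,) = struct.unpack("<Q", header)
                # Skip the length CRC, the payload, and the payload CRC.
                f.seek(4 + length + 4, 1)
                total += 1
    return total
```

Calling it with the same pattern used in eval_input_reader (for example a pattern ending in pet_faces_val.record-?????-of-00010) gives the total across whichever shards are actually present; a result of 0 means the pattern matched nothing.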

moinkhan3012 commented 6 years ago

I had already changed num_examples to the number of images I have in my test images directory (for a custom object). Thank you, but it didn't work for me.

moinkhan3012 commented 6 years ago

Is there any other solution? @jayshah19949596, how did you solve the problem? Can you elaborate, please?

bradknox commented 6 years ago

@baiboat's comment pointed me in the right direction. Specifically, the path to your train.record file needs to be correct in your .config file.

This could be improved by a more helpful error message. In my case, the instructions I was following in this repository have the .config file being written before they recommend a specific directory structure. Following the recommendation then requires going back and changing the .config file, so perhaps the instructions should be reordered to avoid needing to go back and fix a completed step.

burrbank commented 6 years ago

I had this issue but solved it by changing the paths in my config file from absolute paths to relative paths. The message "WARNING:tensorflow:num_readers has been reduced to 0 to match input file shards." means your paths aren't matching anything in the path pattern matching function.
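The link between that warning and the path can be illustrated with a small stand-in. This sketch assumes, based on the warning text and the read_dataset frame in the traceback, that the reader count is clamped to the number of files the pattern matches; effective_num_readers is a hypothetical name and the standard-library glob stands in for TensorFlow's own file matching.

```python
import glob
import logging


def effective_num_readers(input_patterns, num_readers):
    """Clamp num_readers to the number of files matched by the patterns.

    Illustrative stand-in only: when no pattern matches a file, the result
    is 0 readers and an empty file list, which is the situation the
    warning in this thread reports.
    """
    filenames = []
    for pattern in input_patterns:
        filenames.extend(glob.glob(pattern))
    if num_readers > len(filenames):
        num_readers = len(filenames)
        logging.warning(
            "num_readers has been reduced to %d to match input file shards.",
            num_readers,
        )
    return num_readers, filenames
```

If this returns 0 for the patterns in your config when run from the same working directory as the training script, the problem is the path or file name rather than the model.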

djovanoski commented 6 years ago

I hit this issue too, so here is how I solved it: I run model_main.py from research, keep the paths in the config relative, and, if the data is sharded, use a wildcard in the path. For example, I put "object_detection//train* ".

aruna09 commented 6 years ago

I am trying to run train.py and am receiving the following error: ValueError: Tensor conversion requested dtype string for Tensor with dtype float32: 'Tensor("arg0:0", shape=(), dtype=float32, device=/device:CPU:0)'. All paths in my config file are absolute paths. Did anyone find a solution for this?

burrbank commented 6 years ago

@aruna09 Absolute paths do not work; try using a relative path (I ran from models/research, so all my dirs were object_detection/...). There is a format checking function that throws away absolute directories.

tsologub commented 6 years ago

For those who want to train the model in Google Cloud with ML Engine and ran into this error:

In the config file (in my case: ssd_inception_v2_coco.config), in the section train_input_reader, set input_path to "gs://${YOUR_BUCKET_NAME}/data/your_train_record" (e.g. train.record).

And in the section eval_input_reader, set input_path to "gs://${YOUR_BUCKET_NAME}/data/your_eval_record" (e.g. test.record).

alejodosr commented 5 years ago

Just a suggestion: if you are fine-tuning with the pets dataset and mobilenet_v1 pretrained on COCO, remember to modify ssd_mobilenet_v1_pets.config and not ssd_mobilenet_v1_coco.config.

SUSU31 commented 5 years ago

Same error here. I fixed it by changing the training dataset entry to the correct path and file name. (My training data file is named train.tfrecord, while I had used train.record in the config.)

atomsmasher81 commented 5 years ago

My dataset path is correct but I am still getting the same error. :(

GigsyKami commented 5 years ago

@atomsmasher81 Hey, did you manage to solve the error? I'm getting the same error, and my .record file is in the right path.

litchi99 commented 5 years ago

The same error for me. I fixed it by placing my .record and .pbtxt files in object_detection/data/. Besides, if you train your models in a Windows environment, the input_path in the pipeline (.config) file must use r'your_path', "\\", or "/". For example: input_path: "E:\\BaiduYunDownload\\models_sysu\\research\\object_detection\\data\\eval.record"

kaushikb11 commented 5 years ago

The error is thrown when the path to the record files is not right. I would recommend putting the full path to the train and test record files (in the data folder) into the .config file. That should get rid of this particular error.

AkshayaAnil commented 5 years ago

When running on Windows, make sure to use "/" for any configurable path in the pipeline.config file. This could be improved to be Windows friendly too.
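For anyone converting paths by hand, the slash change can be scripted. A minimal sketch, assuming a plain Windows drive path as input; to_config_path is a hypothetical helper name, and pathlib is from the Python 3 standard library.

```python
from pathlib import PureWindowsPath


def to_config_path(windows_path):
    """Rewrite a Windows path with forward slashes for use in pipeline.config.

    PureWindowsPath only manipulates the string, so this works on any
    operating system and does not require the file to exist.
    """
    return PureWindowsPath(windows_path).as_posix()


# Example with a made-up path: backslashes become forward slashes.
print(to_config_path(r"E:\models\research\object_detection\data\eval.record"))
```

The printed value can be pasted between the quotes of input_path or label_map_path.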

Gosia199 commented 5 years ago

I used a file from object_detection\samples\configs as my .config and had the same problem. I solved it by using the pipeline.config that comes inside the downloaded .tar.gz model archive as the pipeline_config_path file. I then passed --pipeline_config_path=ssd_mobilenet_v1_coco_2018_01_28/pipeline.config on the command line, and it works for me.

Suro-One commented 5 years ago

I normally don't comment, but I think this might be helpful.

In your pipeline.config:

Add a * to your train and test record paths. It doesn't seem to accept absolute paths there for some reason.

Example: "gs://my-bucket/data/train.*"

kevin-hxq commented 5 years ago

Check the train data record and eval data record paths in your config.

LInkyBIrd commented 5 years ago

I got the same error on Windows 10! I fixed it by changing '\' to '/' in the tfrecord path.

yocheved1 commented 4 years ago

Hey, I'm sorry for the foolish question, but everyone here is talking about the '.record' and '.config' files. Currently I cannot find those files in my PyCharm project, so my question is: how do I find those files to make sure they are in the right place? I'm using Python 3.7 and Keras 2.3.1. I followed a tutorial and got stuck at the same issue as everyone else here:

Traceback (most recent call last):
  File "C:/my_app/ImageDetection/weed_detection_algorithms/train_model.py", line 109, in <module>
    model = MaskRCNN(mode='training', model_dir='Mask_RCNN/', config=config)
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\mask_rcnn-2.1-py3.7.egg\mrcnn\model.py", line 1837, in __init__
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\mask_rcnn-2.1-py3.7.egg\mrcnn\model.py", line 1876, in build
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\keras\layers\core.py", line 687, in call
    return self.function(inputs, **arguments)
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\mask_rcnn-2.1-py3.7.egg\mrcnn\model.py", line 1876, in <lambda>
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\mask_rcnn-2.1-py3.7.egg\mrcnn\model.py", line 2849, in norm_boxes_graph
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 897, in binary_op_wrapper
    with ops.name_scope(None, op_name, [x, y]) as name:
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\tensorflow_core\python\framework\ops.py", line 6337, in __enter__
    g_from_inputs = _get_graph_from_inputs(self._values)
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\tensorflow_core\python\framework\ops.py", line 5982, in _get_graph_from_inputs
    _assert_same_graph(original_graph_element, graph_element)
  File "C:\Users\WIN10\Anaconda3\lib\site-packages\tensorflow_core\python\framework\ops.py", line 5917, in _assert_same_graph
    (item, original_item))
ValueError: Tensor("lambda_1/Const_1:0", shape=(), dtype=float32) must be from the same graph as Tensor("concat:0", shape=(4,), dtype=float32).

dreamitpossible1 commented 4 years ago

@atomsmasher81 Hey, did you manage to solve the error? I get the same error, and my .record file is in the correct path.

Have you solved the problem? I have encountered the same problem.

ronithsaju commented 3 years ago

I replaced %tensorflow_version 1.x in my code with !pip install tensorflow==1.15.5 and the error disappeared. This is not limited to version 1.15.5; it works for some other TensorFlow 1.x versions as well.