yzcjtr / GeoNet

Code for GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose (CVPR 2018)
MIT License

Error when running camera pose testing #16

Closed Yuzz1020 closed 6 years ago

Yuzz1020 commented 6 years ago

I am running GeoNet on TensorFlow 1.1, CUDA 8.0 and Ubuntu 16.04.

After training the model with the command given for train_rigid mode (pose task), I ran the camera pose test using the command from the Testing - Camera Pose section. The first command

python geonet_main.py --mode=test_pose --dataset_dir=/path/to/kitti/odom/dataset/ --init_ckpt_file=/path/to/trained/model/ --batch_size=1 --seq_length=5 --pose_test_seq=9 --output_dir=/path/to/save/predictions/

returns the following invalid argument error

Traceback (most recent call last):
  File "geonet_main.py", line 166, in <module>
    tf.app.run()
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "geonet_main.py", line 161, in main
    test_pose(opt)
  File "/home/ubuntu/GeoNet/geonet_test_pose.py", line 42, in test_pose
    saver.restore(sess, opt.init_ckpt_file)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1457, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [1,1,256,24] rhs shape= [1,1,256,12]
  [[Node: save/Assign_29 = Assign[T=DT_FLOAT, _class=["loc:@pose_net/Conv_7/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](pose_net/Conv_7/weights, save/RestoreV2_29/_3)]]
  [[Node: save/RestoreV2_4/_48 = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_141_save/RestoreV2_4", _device="/job:localhost/replica:0/task:0/cpu:0"]]

Caused by op u'save/Assign_29', defined at:
  File "geonet_main.py", line 166, in <module>
    tf.app.run()
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "geonet_main.py", line 161, in main
    test_pose(opt)
  File "/home/ubuntu/GeoNet/geonet_test_pose.py", line 26, in test_pose
    saver = tf.train.Saver([var for var in tf.model_variables()])
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1056, in __init__
    self.build()
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1086, in build
    restore_sequentially=self._restore_sequentially)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 691, in build
    restore_sequentially, reshape)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 155, in restore
    self.op.get_shape().is_fully_defined())
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 270, in assign
    validate_shape=validate_shape)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
    use_locking=use_locking, name=name)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/ubuntu/anaconda3/envs/py27/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [1,1,256,24] rhs shape= [1,1,256,12]
  [[Node: save/Assign_29 = Assign[T=DT_FLOAT, _class=["loc:@pose_net/Conv_7/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](pose_net/Conv_7/weights, save/RestoreV2_29/_3)]]
  [[Node: save/RestoreV2_4/_48 = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_141_save/RestoreV2_4", _device="/job:localhost/replica:0/task:0/cpu:0"]]

Any idea how to fix this error? Any help would be appreciated.

Thanks

Yuzz1020 commented 6 years ago

Sorry for the disturbance. I found the cause: I had set --seq_length=3 when training but --seq_length=5 when testing. Setting --seq_length to the same value for both fixes the error.
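
For anyone hitting the same mismatch, the two shapes in the error line up with the two sequence lengths. The sketch below assumes (this is not confirmed in the thread) that the last pose layer outputs 6 values per source frame and that a sequence has seq_length - 1 source frames; the helper names are hypothetical. Under that assumption, 24 channels corresponds to seq_length=5 (the graph built at test time) and 12 channels to seq_length=3 (the weights stored in the checkpoint).

```python
# Assumption (not stated in the thread): the pose head predicts 6 values
# (3 translation + 3 rotation) for each source frame, and a sequence of
# length N has N - 1 source frames.
POSE_DOF = 6


def pose_head_channels(seq_length):
    """Expected output channels of the last pose layer for a sequence length."""
    if seq_length < 2:
        raise ValueError("seq_length must be at least 2")
    return POSE_DOF * (seq_length - 1)


def seq_length_from_channels(channels):
    """Invert pose_head_channels: recover seq_length from the channel count."""
    if channels <= 0 or channels % POSE_DOF != 0:
        raise ValueError("channel count is not a positive multiple of 6")
    return channels // POSE_DOF + 1


# Shapes copied from the error message for pose_net/Conv_7/weights.
graph_shape = [1, 1, 256, 24]       # lhs: variable in the test-time graph
checkpoint_shape = [1, 1, 256, 12]  # rhs: tensor stored in the checkpoint

print(seq_length_from_channels(graph_shape[-1]))       # 5
print(seq_length_from_channels(checkpoint_shape[-1]))  # 3
```

If the two printed values differ, the checkpoint was most likely trained with a different --seq_length than the one passed at test time.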

xiongdemao commented 6 years ago

I am running camera pose test with command given in the Testing - Camera Pose section. The first command

python geonet_main.py --mode=test_pose --dataset_dir=/path/to/kitti/odom/dataset/ --init_ckpt_file=/path/to/trained/model/ --batch_size=1 --seq_length=5 --pose_test_seq=9 --output_dir=/path/to/save/predictions/

returns the following error

DataLossError (see above for traceback): Unable to open table file /home/105/datafile/formatted_data_depth/geonet_posenet/model.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

So, may I ask which path should be passed to

--init_ckpt_file=/path_to_trained_model/?

Is it

--init_ckpt_file=/path_to_trained_model/model.data-00000-of-00001

or

--init_ckpt_file=/path_to_trained_model/model.meta

or

--init_ckpt_file=/path_to_trained_model/model.index

?

yzcjtr commented 6 years ago

Hi @xiongdemao, you should set --init_ckpt_file=/path_to_trained_model/model
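
As background: a checkpoint written by tf.train.Saver in the V2 format is split across files that share one prefix (model.index, model.data-00000-of-00001, model.meta), and saver.restore expects that shared prefix rather than any single file. A minimal sketch of turning any of those file names into the prefix follows; checkpoint_prefix is a hypothetical helper, not part of GeoNet or TensorFlow, and it only handles the three suffixes mentioned in this thread.

```python
import re

# Suffixes of the files that make up one V2 checkpoint:
# <prefix>.index, <prefix>.meta, <prefix>.data-XXXXX-of-XXXXX
_CKPT_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Strip a checkpoint file suffix so only the shared prefix remains.

    Paths that already are a prefix are returned unchanged.
    """
    return _CKPT_SUFFIX.sub("", path)


for name in ("model.data-00000-of-00001", "model.meta", "model.index", "model"):
    print(checkpoint_prefix("/path_to_trained_model/" + name))
# Each line prints /path_to_trained_model/model
```

The resulting string is what would be passed as --init_ckpt_file.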