Guanghan / ROLO

ROLO is short for Recurrent YOLO, aimed at simultaneous object detection and tracking
Apache License 2.0

Doesn't work on TensorFlow 1.5.0 #34

Open markbulgakov opened 6 years ago

markbulgakov commented 6 years ago

I'm trying to reproduce the results with the pre-trained model and got an error:

    python ROLO_for_TF/experiments/testing/ROLO_network_test_all.py
    /usr/lib/python2.7/dist-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
      warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
    ROLO init
    Utils init
    self.cfgPath=
    Default: running ROLO test.
    Building ROLO graph...
    Traceback (most recent call last):
      File "ROLO_for_TF/experiments/testing/ROLO_network_test_all.py", line 275, in <module>
        main(' ')
      File "ROLO_for_TF/experiments/testing/ROLO_network_test_all.py", line 271, in main
        ROLO_TF(argvs)
      File "ROLO_for_TF/experiments/testing/ROLO_network_test_all.py", line 95, in __init__
        self.ROLO(argvs)
      File "ROLO_for_TF/experiments/testing/ROLO_network_test_all.py", line 235, in ROLO
        self.build_networks()
      File "ROLO_for_TF/experiments/testing/ROLO_network_test_all.py", line 127, in build_networks
        self.lstm_module = self.LSTM_single('lstm_test', self.x, self.istate, self.weights, self.biases)
      File "ROLO_for_TF/experiments/testing/ROLO_network_test_all.py", line 110, in LSTM_single
        outputs, state = tf.contrib.rnn.static_rnn(cell, [_X[step]], state)
      File "/tf15/local/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 1310, in static_rnn
        (output, state) = call_cell()
      File "/tf15/local/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 1297, in <lambda>
        call_cell = lambda: cell(input_, state)
      File "/tf15/local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 292, in __call__
        *args, **kwargs)
      File "/tf15/local/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 652, in __call__
        outputs = self.call(inputs, *args, **kwargs)
      File "/tf15/local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 770, in call
        (c_prev, m_prev) = state
      File "/tf15/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 396, in __iter__
        "Tensor objects are not iterable when eager execution is not "
    TypeError: Tensor objects are not iterable when eager execution is not enabled. To iterate over this tensor use tf.map_fn.

So the question is: what should I do? Please help.
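As background (a minimal sketch of my own, not part of the original thread): in recent TensorFlow 1.x versions, LSTM cells keep their state as a (c, h) tuple by default, so passing ROLO's single concatenated istate tensor into static_rnn makes the cell try to unpack a plain Tensor, which raises exactly this TypeError. The sizes below are small placeholders, not ROLO's real dimensions.

    # Minimal sketch (my own illustration): why the ROLO call fails on TF 1.5.
    import tensorflow as tf

    batch_size, num_input = 1, 8  # placeholder sizes, not ROLO's actual values
    x = tf.placeholder(tf.float32, [batch_size, num_input])

    cell = tf.nn.rnn_cell.LSTMCell(num_input)  # state_is_tuple=True by default

    # ROLO's original call passes one concatenated state tensor, which the cell
    # cannot unpack into (c, h), so it raises the TypeError above:
    #   istate = tf.placeholder(tf.float32, [batch_size, 2 * num_input])
    #   outputs, state = tf.contrib.rnn.static_rnn(cell, [x], istate)

    # Passing a proper tuple state (e.g. from cell.zero_state) works:
    init_state = cell.zero_state(batch_size, dtype=tf.float32)
    outputs, state = tf.nn.static_rnn(cell, [x], initial_state=init_state)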

shuchitagupta commented 6 years ago

Did you find a solution to this? Thanks.

wanjinchang commented 6 years ago

I ran the project on TensorFlow 1.5 by modifying the LSTM_single function as follows:

    with tf.device('/gpu:0'):
        # _X, input shape: (batch_size, time_step_size, input_vec_size)
        # XT shape: (time_step_size, batch_size, input_vec_size)
        _X = tf.transpose(_X, [1, 0, 2])  # permute time_step_size and batch_size
        # Reshape to prepare input to hidden activation
        _X = tf.reshape(_X, [self.num_steps * self.batch_size, self.num_input])  # (num_steps*batch_size, num_input)
        # Split data because rnn cell needs a list of inputs for the RNN inner loop
        # Each array shape: (batch_size, num_input)
        _X = tf.split(_X, self.num_steps, 0)  # num_steps * (batch_size, num_input)
        print(_X)

    cell = tf.nn.rnn_cell.LSTMCell(self.num_input, self.num_input)
    state = cell.zero_state(self.batch_size, dtype=tf.float32)
    outputs, state = tf.nn.static_rnn(cell, _X, initial_state=state, dtype=tf.float32)
    tf.get_variable_scope().reuse_variables()

But when I run it with the pretrained model, it does not produce the correct rolo_output_test results; the tracking result is worse.

addvaluejack commented 6 years ago

> But when I run it with the pretrained model, it does not produce the correct rolo_output_test results; the tracking result is worse.

wanjinchang, thanks for your comment. I ran the project on TensorFlow 1.8 following your instructions, but I could not find the correct pre-trained model to reproduce the result. Which model did you use?

wanjinchang commented 6 years ago

The models I used were downloaded from the author's release links in the README.

addvaluejack commented 6 years ago

Hi wanjinchang, yes, I have tried all the models provided in the README, but I cannot find one that works. They all end with this error message: `Tensor name "rnn/lstm_cell/bias" not found in checkpoint files ../../model/model_demo.ckpt`
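A small diagnostic sketch (my addition, not from the thread): list the variable names actually stored in one of the released checkpoints and compare them with the names the rebuilt graph expects. Checkpoints saved with old TensorFlow typically use names like RNN/LSTMCell/W_0 and RNN/LSTMCell/B, while a graph built with tf.nn.rnn_cell.LSTMCell looks for rnn/lstm_cell/kernel and rnn/lstm_cell/bias, hence the "not found in checkpoint" error.

    # Diagnostic sketch (my addition): print the variable names stored in a checkpoint.
    import tensorflow as tf

    CKPT = "../../model/model_demo.ckpt"  # path taken from the error message above
    reader = tf.train.NewCheckpointReader(CKPT)
    for name, shape in sorted(reader.get_variable_to_shape_map().items()):
        print(name, shape)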

wanjinchang commented 6 years ago

This error is caused by the TensorFlow version: the checkpoint was saved with an older TensorFlow, so you should convert it so that its variable names match the newer TensorFlow version, as in the following code. Maybe it can be solved!

    import tensorflow as tf

    OLD_CHECKPOINT_FILE = "your ckpt path/model_step3_exp3.ckpt"
    NEW_CHECKPOINT_FILE = "your ckpt path/model_step3_exp3_new.ckpt"

    # Map old variable names to the names expected by newer TensorFlow
    vars_to_rename = {
        "RNN/LSTMCell/W_0": "rnn/lstm_cell/kernel",
        "RNN/LSTMCell/B": "rnn/lstm_cell/bias",
    }

    new_checkpoint_vars = {}
    reader = tf.train.NewCheckpointReader(OLD_CHECKPOINT_FILE)
    for old_name in reader.get_variable_to_shape_map():
        print(old_name)
        if old_name in vars_to_rename:
            new_name = vars_to_rename[old_name]
        else:
            new_name = old_name
        new_checkpoint_vars[new_name] = tf.Variable(reader.get_tensor(old_name))

    init = tf.global_variables_initializer()
    saver = tf.train.Saver(new_checkpoint_vars)

    with tf.Session() as sess:
        sess.run(init)
        saver.save(sess, NEW_CHECKPOINT_FILE)
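A short follow-up sketch (my addition, assuming the conversion script above finished without errors): check that the renamed variables are actually present in the new checkpoint before pointing ROLO_network_test_all.py at it.

    # Verification sketch (my addition): confirm the converted checkpoint now
    # contains the variable names that newer TensorFlow graphs expect.
    import tensorflow as tf

    NEW_CHECKPOINT_FILE = "your ckpt path/model_step3_exp3_new.ckpt"  # same path as in the script above
    names = tf.train.NewCheckpointReader(NEW_CHECKPOINT_FILE).get_variable_to_shape_map()
    assert "rnn/lstm_cell/kernel" in names and "rnn/lstm_cell/bias" in names
    print("converted checkpoint contains:", sorted(names))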