tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

Transformer ASR, not able to import MetaGraph, KeyError: u'InfeedEnqueueTuple' #986

Closed. ajithAI closed this issue 6 years ago.

ajithAI commented 6 years ago

Description

Hi. I am trying to load the MetaGraph for inference without TF eager execution (ASR Transformer, https://tensorflow.github.io/tensor2tensor/tutorials/asr_with_transformer.html) and I get an error: KeyError: u'InfeedEnqueueTuple'.

Did I miss any module to import?

Environment information

TF 1.9, Python 3

If there is any reference for running inference without eager execution, that would be very helpful.

Error logs:

Traceback:

KeyError                                  Traceback (most recent call last)
<ipython-input> in <module>()
      1 sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
----> 2 saver = tf.train.import_meta_graph("checkpoints/transformer_asr_180214/model.ckpt-230000.meta")

/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.pyc in import_meta_graph(meta_graph_or_file, clear_devices, import_scope, **kwargs)
   1958       clear_devices=clear_devices,
   1959       import_scope=import_scope,
-> 1960       **kwargs)
   1961
   1962   if meta_graph_def.HasField("saver_def"):

/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/meta_graph.pyc in import_scoped_meta_graph(meta_graph_or_file, clear_devices, graph, import_scope, input_map, unbound_inputs_col_name, restore_collections_predicate)
    742           name=(import_scope or scope_to_prepend_to_names),
    743           input_map=input_map,
--> 744           producer_op_list=producer_op_list)
    745
    746   # Restores all the other collections.

/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.pyc in new_func(*args, **kwargs)
    430           'in a future version' if date is None else ('after %s' % date),
    431           instructions)
--> 432       return func(*args, **kwargs)
    433     return tf_decorator.make_decorator(func, new_func, 'deprecated',
    434                                        _add_deprecated_arg_notice_to_docstring(

/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.py in import_graph_def(graph_def, input_map, return_elements, name, op_dict, producer_op_list)
    389   if producer_op_list is not None:
    390     # TODO(skyewm): make a copy of graph_def so we're not mutating the argument?
--> 391     _RemoveDefaultAttrs(op_dict, producer_op_list, graph_def)
    392     #pass
    393

/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.py in _RemoveDefaultAttrs(op_dict, producer_op_list, graph_def)
    156     # Remove any default attr values that aren't in op_def.
    157     if node.op in producer_op_dict:
--> 158       op_def = op_dict[node.op]
    159       producer_op_def = producer_op_dict[node.op]
    160       # We make a copy of node.attr to iterate through since we may modify

KeyError: u'InfeedEnqueueTuple'
sxsxsx commented 6 years ago

I have met the same problem. Have you solved it? @ajithAI

ajithAI commented 6 years ago

Yup. Include the following:

from tensor2tensor import models
from tensor2tensor import problems
from tensor2tensor.layers import common_layers
from tensor2tensor.utils import trainer_lib
from tensor2tensor.utils import t2t_model
from tensor2tensor.utils import registry

Some module from the above is needed.
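With those imports executed first, the original loading code from the traceback (same checkpoint path) gets past the KeyError; a minimal sketch:

import tensorflow as tf

# The tensor2tensor imports above must run before import_meta_graph, since
# they appear to register the extra op definitions (including the TPU
# infeed ops) that the exported graph refers to.
sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
saver = tf.train.import_meta_graph(
    "checkpoints/transformer_asr_180214/model.ckpt-230000.meta")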

After that I ran into another issue.

I am not able to restore the weights. The error is: InvalidArgumentError: No OpKernel was registered to support Op 'InfeedEnqueueTuple' with these attrs. Registered devices: [CPU], Registered kernels:

stefan-falk commented 6 years ago

@ajithAI This is not working for me ..

I am executing

import os

import tensorflow as tf

from tensor2tensor import models
from tensor2tensor import problems
from tensor2tensor.layers import common_layers
from tensor2tensor.utils import trainer_lib
from tensor2tensor.utils import t2t_model
from tensor2tensor.utils import registry

def main():

    latest_ckpt = tf.train.latest_checkpoint('/data/workspaces/git/speech/asr/shell/model/transformer')

    meta_graph_path = latest_ckpt + '.meta'

    with tf.Session() as sess:
        saver = tf.train.import_meta_graph(meta_graph_path)
        saver.restore(sess, latest_ckpt)

    print('All done.')

if __name__ == '__main__':
    main()

but I get:


InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Cannot assign a device for operation 'transformer/parallel_0_5/transformer/transformer/body/encoder/layer_0/self_attention/multihead_attention/q/Tensordot/ListDiff': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Registered kernels:
  device='CPU'; T in [DT_STRING]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_STRING]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_HALF]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_HALF]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_INT8]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_INT8]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_INT16]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_INT16]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_INT32]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_INT32]; out_idx in [DT_INT32]
  device='CPU'; T in [DT_INT64]; out_idx in [DT_INT64]
  device='CPU'; T in [DT_INT64]; out_idx in [DT_INT32]

     [[Node: transformer/parallel_0_5/transformer/transformer/body/encoder/layer_0/self_attention/multihead_attention/q/Tensordot/ListDiff = ListDiff[T=DT_INT32, out_idx=DT_INT32, _device="/device:GPU:0"](transformer/parallel_0_5/transformer/transformer/body/encoder/layer_0/self_attention/multihead_attention/q/Tensordot/range, transformer/parallel_0_5/transformer/transformer/body/encoder/layer_0/self_attention/multihead_attention/q/Tensordot/add_1)]]

Any idea how to fix this? ..

ajithAI commented 6 years ago

This is because the MetaGraph does not fit with the checkpoint index.

So I loaded the model from tensor2tensor as below:

problem_name = "librispeech_clean"
asr_problem = problems.problem(problem_name)
encoders = asr_problem.feature_encoders(None)

model_name = "transformer"
hparams_set = "transformer_librispeech_tpu"

hparams = trainer_lib.create_hparams(hparams_set, data_dir="./", problem_name=problem_name)
asr_model = registry.model(model_name)(hparams, Modes.PREDICT)
inf = asr_model.infer(inputs, beam_size=2, alpha=0.6, decode_length=1)["outputs"]

and I loaded the weights from the trained model:

saver = tf.train.Saver()
saver.restore(sess, tf.train.latest_checkpoint('checkpoints/transformer_asr_180214/'))

Now you can run sess.run().
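For completeness, the full flow would look roughly like this. This is a sketch, not the exact code I ran: Modes is tf.estimator.ModeKeys, and the placeholder shape for the audio features is an assumption that you should adapt to your own preprocessing.

import tensorflow as tf
from tensor2tensor import problems
from tensor2tensor.utils import trainer_lib
from tensor2tensor.utils import registry

Modes = tf.estimator.ModeKeys

# Problem, model, and hparams, as above.
problem_name = "librispeech_clean"
asr_problem = problems.problem(problem_name)
encoders = asr_problem.feature_encoders(None)

model_name = "transformer"
hparams_set = "transformer_librispeech_tpu"
hparams = trainer_lib.create_hparams(hparams_set, data_dir="./",
                                     problem_name=problem_name)

# Hypothetical placeholder for batched audio features; the shape depends
# on how you preprocess the audio for the librispeech problem.
inputs_ph = tf.placeholder(tf.float32, shape=[None, None, 80, 1], name="inputs")
features = {"inputs": inputs_ph}

# Build the inference graph directly from tensor2tensor instead of
# importing the TPU-exported MetaGraph.
asr_model = registry.model(model_name)(hparams, Modes.PREDICT)
outputs = asr_model.infer(features, beam_size=2, alpha=0.6,
                          decode_length=1)["outputs"]

with tf.Session() as sess:
    saver = tf.train.Saver()
    saver.restore(sess, tf.train.latest_checkpoint(
        'checkpoints/transformer_asr_180214/'))
    # decoded_ids = sess.run(outputs, feed_dict={inputs_ph: my_features})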

ajithAI commented 6 years ago

(Quoting stefan-falk's comment above in full.)

Try:

sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True))
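In context, the loading code above becomes something like this (a sketch; allow_soft_placement lets TensorFlow fall back to a CPU kernel when an op such as ListDiff is pinned to /device:GPU:0 but has no GPU kernel registered):

config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)

with tf.Session(config=config) as sess:
    # Soft placement moves GPU-pinned ops without GPU kernels back to the CPU.
    saver = tf.train.import_meta_graph(meta_graph_path)
    saver.restore(sess, latest_ckpt)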

Punchwes commented 5 years ago

@ajithAI Hello, have you solved the 'No OpKernel' error? I have also encountered this problem and could not find a solution.