mdangschat / ctc-asr

End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.
MIT License

Input & output of graph #11

Closed ramrahu closed 5 years ago

ramrahu commented 5 years ago

I wanted to know what the input and output nodes of the graph generated by your code are. Could you please provide this information? Thank you in advance.

mdangschat commented 5 years ago

Hi @ramrahu, the network's inputs are batches of spectrograms (e.g. MFCC) and their lengths, as well as the integer-encoded plaintext labels (in case of training). By default, the network's input layer is a 2D convolutional layer, defined in asr/model.py:161, named conv/conv2d. The output layer is a fully connected layer, defined in asr/model.py:233, named logits/dense. Finally, the logits outputs are decoded into the N most likely integer-encoded label sequences using ctc_beam_search_decoder, see https://www.tensorflow.org/api_docs/python/tf/nn/ctc_beam_search_decoder.
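To make the last step more concrete, here is a minimal sketch of how per-frame logits turn into integer labels. Note that this is greedy (best-path) CTC decoding, a simpler stand-in for the beam search the model actually uses, and the function name and toy values are made up for illustration. It assumes the TensorFlow CTC convention that the blank label is the last class index.

```python
from itertools import groupby


def ctc_greedy_decode(logits, blank_index):
    """Greedy CTC decoding of `logits` shaped [time][num_classes]."""
    # 1. Pick the highest-scoring class for every time frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    # 2. Merge consecutive repeats of the same class.
    merged = [label for label, _ in groupby(best_path)]
    # 3. Drop the blank label; what remains is the label sequence.
    return [label for label in merged if label != blank_index]


# Toy example with 3 classes, where class 2 is the blank.
logits = [
    [0.1, 0.2, 0.7],    # blank
    [0.8, 0.1, 0.1],    # 0
    [0.9, 0.05, 0.05],  # 0 again, merged with the previous frame
    [0.1, 0.1, 0.8],    # blank, separates the two 0 labels
    [0.7, 0.2, 0.1],    # 0
    [0.1, 0.8, 0.1],    # 1
]
print(ctc_greedy_decode(logits, blank_index=2))  # [0, 0, 1]
```

Beam search keeps several candidate paths per frame instead of only the best one, which is why it can return the N most likely sequences.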

ramrahu commented 5 years ago

OK, actually I am trying to build an RNN architecture for speech recognition on a mobile phone. I just wanted an idea of how a BDRNN model works in a TensorFlow Lite app, and thought this might be a good reference. So when freezing the graph to a .pb file, I needed this info. I actually trained your model using the ds1 option. Do the input and output node names remain the same?

mdangschat commented 5 years ago

Since the first Deep Speech paper used fully connected input layers, the name changes as well. dense/dense should be the input layer's name in that case.

https://github.com/mdangschat/ctc-asr/blob/v0.1.0/asr/util/tf_contrib.py#L50
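For the freezing step itself, a rough sketch under assumptions could look like the following. It uses the TensorFlow 1.x calls tf.train.import_meta_graph and tf.graph_util.convert_variables_to_constants. The helper names, paths, and the full op names in the toy list are hypothetical; only the conv/conv2d, dense/dense, and logits/dense scopes come from this thread. The exact op names have to be read from the graph, and a graph that uses contrib ops may need the matching contrib module imported before the meta graph can be loaded.

```python
def select_nodes(node_names, scope):
    """Return the node names that sit at or below the given name scope."""
    prefix = scope.rstrip("/") + "/"
    return [name for name in node_names if name == scope or name.startswith(prefix)]


def freeze_checkpoint(checkpoint_dir, output_node_names, frozen_path):
    """Freeze the latest checkpoint into a .pb file (TensorFlow 1.x API)."""
    import tensorflow as tf  # imported lazily; this function assumes TensorFlow 1.x

    checkpoint = tf.train.latest_checkpoint(checkpoint_dir)
    with tf.Session(graph=tf.Graph()) as sess:
        # Rebuild the graph from the checkpoint's meta file and load the weights.
        saver = tf.train.import_meta_graph(checkpoint + ".meta", clear_devices=True)
        saver.restore(sess, checkpoint)
        # Keep only the ops needed to compute the requested output nodes.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph.as_graph_def(), output_node_names)
    with open(frozen_path, "wb") as handle:
        handle.write(frozen.SerializeToString())
    return [node.name for node in frozen.node]


# Hypothetical node list, e.g. taken from [n.name for n in graph_def.node].
names = [
    "conv/conv2d/Conv2D",
    "dense/dense/MatMul",
    "dense/dense/kernel/Regularizer/l2_regularizer/L2Loss",
    "logits/dense/BiasAdd",
]
print(select_nodes(names, "logits/dense"))  # ['logits/dense/BiasAdd']
```

Listing the nodes under a scope first is a cheap way to confirm which op name to pass as the output node before freezing.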

ramrahu commented 5 years ago

OK, thank you for your response and help. I may have a couple more doubts. I'll get back to you in a while if you don't mind.

ramrahu commented 5 years ago

When I try to deploy in my android app I get this error:

java.lang.IllegalArgumentException: No OpKernel was registered to support Op 'L2Loss' used by {{node dense/dense/kernel/Regularizer/l2_regularizer/L2Loss}}with these attrs: [T=DT_FLOAT]
Registered devices: [CPU]
Registered kernels:

[[{{node dense/dense/kernel/Regularizer/l2_regularizer/L2Loss}}]]
    at org.tensorflow.Session.run(Native Method)
    at org.tensorflow.Session.access$100(Session.java:48)
    at org.tensorflow.Session$Runner.runHelper(Session.java:314)
    at org.tensorflow.Session$Runner.run(Session.java:264)
    at org.tensorflow.contrib.android.TensorFlowInferenceInterface.run(TensorFlowInferenceInterface.java:228)
    at org.tensorflow.contrib.android.TensorFlowInferenceInterface.run(TensorFlowInferenceInterface.java:197)
    at org.tensorflow.contrib.android.TensorFlowInferenceInterface.run(TensorFlowInferenceInterface.java:187)
    at org.tensorflow.demo.SpeechActivity.recognize(SpeechActivity.java:229)
    at org.tensorflow.demo.SpeechActivity.access$100(SpeechActivity.java:48)
    at org.tensorflow.demo.SpeechActivity$3.run(SpeechActivity.java:193)

Any idea what this is?

mdangschat commented 5 years ago

Sadly, I have no experience with TensorFlow Android deployment. If I had to guess, I'd say you are accidentally not using the prediction graph (mode tf.estimator.ModeKeys.PREDICT). Another possibility seems to be a missing implementation for float32, as suggested in this issue (see last few comments).

ramrahu commented 5 years ago

OK, I was actually confused about where this node was pointing to: dense/dense/kernel/Regularizer/l2_regularizer. Have you used a node like this in the code, or some L2 loss function? I wasn't able to locate it.

mdangschat commented 5 years ago

It's initialized in asr/model.py:147 and then used for the dense layer in asr/util/tf_contrib.py:56. Since it's imported from tensorflow.contrib, it's possible that this is missing on Android. You could try a non-contrib regularizer or regularizer = None.

ramrahu commented 5 years ago

I assume you are talking about this parameter: kernel_regularizer? Instead of setting kernel_regularizer=regularizer, I set it to false?

mdangschat commented 5 years ago

I would change asr/model.py:147 to regularizer = None, since the same regularizer is used again, later.
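As a plain-Python illustration of what gets dropped with that change: TensorFlow's L2Loss op computes sum(w ** 2) / 2 over a tensor, and an L2 regularizer adds a scaled copy of that value to the training loss. The function names, weights, and scale below are made up; this is only a sketch of the arithmetic, not the repository's code.

```python
def l2_loss(weights):
    """Same formula as TensorFlow's L2Loss op: sum(w ** 2) / 2."""
    return sum(w * w for w in weights) / 2.0


def total_loss(ctc_loss, weights, scale=None):
    """Training loss; `scale=None` mimics setting `regularizer = None`."""
    if scale is None:
        # No regularizer: no L2Loss term (and no L2Loss node) is created.
        return ctc_loss
    return ctc_loss + scale * l2_loss(weights)


weights = [0.5, -1.0, 2.0]
print(l2_loss(weights))                      # 2.625
print(total_loss(10.0, weights, scale=0.5))  # 11.3125
print(total_loss(10.0, weights))             # 10.0
```

The penalty only affects the loss used during training, so removing it changes how the weights are learned but not how logits are computed from a spectrogram at prediction time.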

ramrahu commented 5 years ago

oh ok thank you so much for the help. i'll try it out

ramrahu commented 5 years ago

Hi, I tried to re-run your model and I get this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Fail to find the dnn implementation.
         [[{{node rnn/cudnn_rnn_relu/cudnn_rnn_relu/CudnnRNNCanonicalToParams}}]]

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "asr/train.py", line 83, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "asr/train.py", line 55, in main
    estimator.train(input_fn=curriculum_train_input_fn, hooks=None)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1158, in _train_model_default
    saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1403, in _train_with_estimator_spec
    log_step_count_steps=log_step_count_steps) as mon_sess:
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 508, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 934, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 648, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1122, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1127, in _create_session
    return self._sess_creator.create_session()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 805, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 571, in create_session
    init_fn=self._scaffold.init_fn)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/session_manager.py", line 287, in prepare_session
    sess.run(init_op, feed_dict=init_feed_dict)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Fail to find the dnn implementation.
         [[node rnn/cudnn_rnn_relu/cudnn_rnn_relu/CudnnRNNCanonicalToParams (defined at /workspace/asr/model.py:215) ]]
Caused by op 'rnn/cudnn_rnn_relu/cudnn_rnn_relu/CudnnRNNCanonicalToParams', defined at:
  File "asr/train.py", line 83, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "asr/train.py", line 55, in main
    estimator.train(input_fn=curriculum_train_input_fn, hooks=None)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1154, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/workspace/asr/model.py", line 54, in model_fn
    spectrogram, spectrogram_length, training=(mode == tf.estimator.ModeKeys.TRAIN))
  File "/workspace/asr/model.py", line 215, in inference_fn
    output_rnn, _ = rnn(output3)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/layers/base.py", line 530, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 538, in __call__
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1603, in _maybe_build
    self.build(input_shapes)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/cudnn_rnn/python/layers/cudnn_rnn.py", line 353, in build
    opaque_params_t = self._canonical_to_opaque(weights, biases)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/cudnn_rnn/python/layers/cudnn_rnn.py", line 476, in _canonical_to_opaque
    direction=self._direction)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/cudnn_rnn/python/ops/cudnn_rnn_ops.py", line 1343, in cudnn_rnn_canonical_to_opaque_params
    name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_cudnn_rnn_ops.py", line 685, in cudnn_rnn_canonical_to_params
    seed=seed, seed2=seed2, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

UnknownError (see above for traceback): Fail to find the dnn implementation.
         [[node rnn/cudnn_rnn_relu/cudnn_rnn_relu/CudnnRNNCanonicalToParams (defined at /workspace/asr/model.py:215) ]]

mdangschat commented 5 years ago

I don't know how to deploy a CUDNN trained model on Android. You could try to train it without CUDNN by editing asr/params.py:106.

ramrahu commented 5 years ago

Yeah, I noticed that option. This error is not with Android; it's when I run the training script. It ran the first time, but now when I run it, it throws the error mentioned above.

mdangschat commented 5 years ago

Is this occurring since you set the regularizer to None? I assume you are still trying to train with CUDA enabled? In that case please make sure your CUDA and CUDNN installations are compatible with TensorFlow itself. Additionally you could try to disable CUDA and check if it's working then, as mentioned above.
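A quick way to gather the facts for such a compatibility check is to ask TensorFlow what it was built with and whether it can see a GPU. The sketch below is only a diagnostic helper with a made-up name (gpu_report); it relies on tf.__version__, tf.test.is_built_with_cuda and tf.test.is_gpu_available, and it returns empty values if TensorFlow is not installed. The installed CUDA and cuDNN versions still have to be compared by hand against the tested build configurations in the TensorFlow install guide.

```python
def gpu_report():
    """Collect basic facts about the local TensorFlow/GPU setup."""
    report = {"tensorflow_version": None, "built_with_cuda": None, "gpu_available": None}
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed; leave every field as None.
        return report
    report["tensorflow_version"] = tf.__version__
    report["built_with_cuda"] = tf.test.is_built_with_cuda()
    # is_gpu_available is the TensorFlow 1.x check; guard in case it is absent.
    if hasattr(tf.test, "is_gpu_available"):
        report["gpu_available"] = tf.test.is_gpu_available()
    return report


print(gpu_report())
```

If built_with_cuda is true but gpu_available is false, the driver, CUDA, or cuDNN installation is the first place to look.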

ramrahu commented 5 years ago

Hi it got fixed on its own. I retrained the model with regularizer=None as you suggested. Will check it out on the app. Thank you