NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

Inference with TRT error #336

Closed · jianxiangm closed 5 years ago

jianxiangm commented 5 years ago

Hi, I tried to run DeepSpeech2 inference with TensorRT by setting `use_trt: True` in the config file, but I get an error:

```
ValueError: Dimension must be 3 but is 4 for 'ForwardPass/ds2_encoder/transpose' (op: 'Transpose') with input shapes: [?,?,1], [4].
```

I checked the code in model.py around line 566:

```python
input_placeholders = {
    'source_tensors': [
        tf.placeholder(shape=(None, None), dtype=tf.int32, name='input_map1'),
        tf.placeholder(shape=(None, None), dtype=tf.int32, name='input_map2')
    ]
}
loss, outputs = self._build_forward_pass_graph(
    input_placeholders, gpu_id=gpu_id
)
```

so the input passed to the ds2_encoder is:

```
{'source_tensors': [<tf.Tensor 'input_map1:0' shape=(?, ?) dtype=int32>, <tf.Tensor 'input_map2:0' shape=(?, ?) dtype=int32>]}
```

but the actual input the ds2_encoder expects is `[batch_size, max_seq_length, fea_dim]`:

```
{'source_tensors': [<tf.Tensor 'IteratorGetNext:0' shape=(32, ?, 96) dtype=float32>, <tf.Tensor 'Reshape_1:0' shape=(32,) dtype=int32>]}
```

So I modified the input placeholders:

```python
input_placeholders = {
    'source_tensors': [
        tf.placeholder(shape=(None, None, 96), dtype=tf.float32, name='input_map1'),
        tf.placeholder(shape=(None), dtype=tf.int32, name='input_map2')
    ]
}
```

but I still get an error in build_encoder, at `top_layer = tf.reshape(top_layer, [batch_size, -1, fc])`:

```
Failed to convert object of type <class 'list'> to Tensor. Contents: [None, -1, 768]. Consider casting elements to a supported type.
```

I am confused; any suggestions? Thanks.
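For readers hitting the same two errors, here is a minimal standalone sketch of the idea being attempted above, assuming the feature dimension (96) and hidden size (768) taken from the error messages; it is not the fix that was later merged into the repo:

```python
import tensorflow as tf

# Hedged sketch: inference placeholders shaped like the DS2 encoder's real
# inputs, i.e. features of shape [batch, time, num_features] plus a 1-D
# sequence-length tensor (num_features=96 comes from the shapes reported above).
num_features = 96
features = tf.placeholder(shape=(None, None, num_features), dtype=tf.float32,
                          name='input_map1')
lengths = tf.placeholder(shape=(None,), dtype=tf.int32, name='input_map2')

# The second error occurs because the static batch size is None, and None is
# not a valid entry in the shape list passed to tf.reshape. Reading the batch
# size dynamically avoids it; the same pattern would apply to the encoder's
# `tf.reshape(top_layer, [batch_size, -1, fc])` line.
batch_size = tf.shape(features)[0]   # scalar tensor instead of Python None
reshaped = tf.reshape(features, [batch_size, -1, num_features])
```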

borisgin commented 5 years ago

Can you check if this error exists when you run OpenSeq2Seq inside the NVIDIA container, as described in https://nvidia.github.io/OpenSeq2Seq/html/installation.html, please?

jianxiangm commented 5 years ago

> Can you check if this error exists when you run OpenSeq2Seq inside the NVIDIA container, as described in https://nvidia.github.io/OpenSeq2Seq/html/installation.html, please?

Hi, I tried to run it inside the NVIDIA container, and the error is the same:

```
ValueError: Dimension must be 3 but is 4 for 'ForwardPass/ds2_encoder/transpose' (op: 'Transpose') with input shapes: [?,?,1], [4].
root@d139f482fea5:/workspace/nvidia-examples/OpenSeq2Seq#
```

borisgin commented 5 years ago

Thanks, we will check it with the TRT team.

trevor-m commented 5 years ago

Hi, thanks for reporting the issue! I've created a PR to fix this problem: https://github.com/NVIDIA/OpenSeq2Seq/pull/341

trevor-m commented 5 years ago

Great, I'm glad the error was fixed! This outcome is somewhat expected: most of the TF-TRT work needed to support this type of model has only recently been added. If you use the master branch of TensorFlow, I would expect more of the model to convert (you will have to compile it from source). We hope to get these features into the NVIDIA containers soon.
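For context, a hedged sketch of how TF-TRT conversion is typically invoked in TF 1.x via tensorflow.contrib.tensorrt; the frozen-graph path and output node name below are placeholders, and this is not necessarily how OpenSeq2Seq's `use_trt` flag drives the conversion internally:

```python
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt  # TF-TRT module in TF 1.x containers

def convert_frozen_graph(frozen_graph_path, output_names):
    # Load a frozen GraphDef from disk.
    with tf.gfile.GFile(frozen_graph_path, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    # Replace supported subgraphs with TensorRT engines; ops that TF-TRT does
    # not yet support stay in the graph and run in native TensorFlow.
    return trt.create_inference_graph(
        input_graph_def=graph_def,
        outputs=output_names,
        max_batch_size=32,
        max_workspace_size_bytes=1 << 30,
        precision_mode='FP16',
        minimum_segment_size=3)

# Hypothetical usage (path and node name are made up for illustration):
# trt_graph_def = convert_frozen_graph('ds2_frozen.pb', ['logits'])
```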

Please note that the TF-TRT team hasn't looked at the DeepSpeech2 model directly yet, so there may be little or no speedup.