NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

Failed to execute engine, retrying with native segment for TRTEngineOp_2 #460

Open pratapaprasanna opened 5 years ago

pratapaprasanna commented 5 years ago

Hi all,

I want to use TensorRT for inference, so I set the use_trt flag to True. When I run inference on my trained model with this command:
python run.py --config_file=example_configs/speech2text/jasper10x5_LibriSpeech_nvgrad.py --mode=infer --infer_output_file=goutham.txt

I get the following errors:

*** Processed 1/2 batches
2019-06-10 11:56:00.060508: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.060566: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_0
2019-06-10 11:56:00.075305: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.075356: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_3
2019-06-10 11:56:00.075781: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.075823: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_4
2019-06-10 11:56:00.076262: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.076371: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_5
2019-06-10 11:56:00.076984: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.077103: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_6
2019-06-10 11:56:00.077678: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.077767: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_7
2019-06-10 11:56:00.079553: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.079649: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_8
2019-06-10 11:56:00.080088: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.080186: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_9
2019-06-10 11:56:00.080642: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.080672: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_10
2019-06-10 11:56:00.080963: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.080991: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_11
2019-06-10 11:56:00.081295: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.081323: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_12
2019-06-10 11:56:00.190411: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.190612: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_13
2019-06-10 11:56:00.191311: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.191352: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_14
2019-06-10 11:56:00.191771: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.191851: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_15
2019-06-10 11:56:00.192284: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.192315: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_16
2019-06-10 11:56:00.192721: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.192800: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_17
2019-06-10 11:56:00.194687: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.194716: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_18
2019-06-10 11:56:00.195075: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.195104: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_19
2019-06-10 11:56:00.195488: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.195569: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_20
2019-06-10 11:56:00.195994: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.196022: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_21
2019-06-10 11:56:00.196374: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.196402: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_1
2019-06-10 11:56:00.199283: E tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:333] FP16 inputs are not supported yet!
2019-06-10 11:56:00.199318: W tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:297] Failed to execute engine, retrying with native segment for TRTEngineOp_2
*** Processed 2/2 batches
*** Not enough steps for benchmarking
*** Finished inference
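
To make a log like this easier to read, the repeated messages can be tallied per engine op. The following is a minimal sketch, not part of OpenSeq2Seq or TensorFlow: the names `summarize_trt_log`, `FALLBACK_RE` and `FP16_MSG` are hypothetical, and the patterns assume the message wording stays exactly as in the log above.

```python
import re
from collections import Counter

# Assumption: the warning text matches the log above word for word.
FALLBACK_RE = re.compile(
    r"Failed to execute engine, retrying with native segment for (TRTEngineOp_\d+)"
)
FP16_MSG = "FP16 inputs are not supported yet!"


def summarize_trt_log(log_text):
    """Return (fallback count per engine op, number of FP16 input errors)."""
    fallbacks = Counter()
    fp16_errors = 0
    for line in log_text.splitlines():
        if FP16_MSG in line:
            fp16_errors += 1
        match = FALLBACK_RE.search(line)
        if match:
            fallbacks[match.group(1)] += 1
    return fallbacks, fp16_errors


if __name__ == "__main__":
    import sys

    # Usage: python summarize_trt_log.py < inference.log
    fallbacks, fp16_errors = summarize_trt_log(sys.stdin.read())
    # Sort by the numeric suffix so TRTEngineOp_2 comes before TRTEngineOp_10.
    for op in sorted(fallbacks, key=lambda name: int(name.rsplit("_", 1)[1])):
        print(op, fallbacks[op])
    print("FP16 input errors:", fp16_errors)
```

In the log above, every fallback warning is immediately preceded by the FP16 input error, so the two counts should match when the script is run on it.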

git diff example_configs/speech2text/jasper10x5_LibriSpeech_nvgrad.py
diff --git a/example_configs/speech2text/jasper10x5_LibriSpeech_nvgrad.py b/example_configs/speech2text/jasper10x5_LibriSpeech_nvgrad.py
index 0a1cf09a..46aed2cd 100644
--- a/example_configs/speech2text/jasper10x5_LibriSpeech_nvgrad.py
+++ b/example_configs/speech2text/jasper10x5_LibriSpeech_nvgrad.py
@@ -14,19 +14,21 @@ base_model = Speech2Text

 base_params = {
     "random_seed": 0,
-    "use_horovod": True,
+    "use_horovod": False,
     "num_epochs": 400,
+    "use_trt" : True,

-    "num_gpus": 8,
+    "num_gpus": 1,
     "batch_size_per_gpu": 32,
     "iter_size": 1,

     "save_summaries_steps": 100,
     "print_loss_steps": 10,
     "print_samples_steps": 2200,
+    "load_model": "/bigdata/jasper_bs_32_logs/best_models/",
     "eval_steps": 2200,
     "save_checkpoint_steps": 1100,
-    "logdir": "jasper_log_folder",
+    "logdir": "/bigdata/jasper_bs_32_logs/best_models/",
     "num_checkpoints": 2,

     "optimizer": NovoGrad,
@@ -186,7 +188,7 @@ base_params = {
         "norm_per_feature": True,
         "window": "hanning",
         "precompute_mel_basis": True,
-        "sample_freq": 16000,
+        "sample_freq": 8000,
         "pad_to": 16,
         "dither": 1e-5,
         "backend": "librosa"
@@ -223,7 +225,7 @@ infer_params = {
     "data_layer": Speech2TextDataLayer,
     "data_layer_params": {
         "dataset_files": [
-            "/data/librispeech/librivox-test-clean.csv",
+            "librivox-test-clean.csv",
         ],
         "shuffle": False,
     },

Can anyone please tell me whether I am missing anything on my side?

One more thing: I trained my model with dtype: "mixed" (https://github.com/NVIDIA/OpenSeq2Seq/blob/master/example_configs/speech2text/jasper10x5_LibriSpeech_nvgrad.py#L50).

First, I am guessing this error is caused by that dtype.

Second, can I ignore this error, given that I got inference output for my entire inference dataset?

Please correct me if I am wrong.

Thanks,
pratapa prasanna