dusty-nv / jetson-containers

Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
MIT License

RIVA doesn't seem to work with DP6 on Orin AGX #450

Open jasonthenderson opened 3 months ago

jasonthenderson commented 3 months ago

Poking around, I see that Riva says it is only supported up to JetPack 5.1. But there are examples in these containers that use it, and the containers otherwise all work with JetPack 6 DP, so I've been trying to get it running, to no avail, including a full reflash and reinstall. It would help others if it were clearly noted that Riva doesn't work with JetPack 6 DP, so they don't spend time trying to get it to work.

This btw is the error I get:

/opt/riva/bin/start-riva: line 10: curl: command not found
/opt/riva/bin/start-riva: line 11: [: -ne: unary operator expected
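The Triton log that follows is long, and the entries that matter most are the `failed to load` lines. As a minimal sketch (not part of Riva or Triton), the Python below groups those entries by apparent cause, assuming the log has been saved as plain text in the same format as shown in this issue. The function names and the category labels ("no GPU visible", "libstdc++ too old") are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Matches "failed to load '<model>' version N: <reason>" and stops the reason
# at the next glog-style prefix (e.g. " E0324 18:14:03.964964 "), at a
# "WARNING: " marker, or at the end of the text.
FAILED = re.compile(
    r"failed to load '(?P<model>[^']+)' version \d+: (?P<reason>.*?)"
    r"(?=\s+[IWE]\d{4}\s+\d{2}:\d{2}:\d{2}\.\d+\s|\s+WARNING: |\Z)",
    re.S,
)


def classify(reason):
    """Map a failure reason to a coarse, illustrative category."""
    if "no GPUs are available" in reason:
        return "no GPU visible"
    if "GLIBCXX" in reason:
        return "libstdc++ too old"
    return "other"


def summarize(log_text):
    """Return {category: [model names]} for every failed model load."""
    groups = defaultdict(list)
    for match in FAILED.finditer(log_text):
        groups[classify(match.group("reason"))].append(match.group("model"))
    return dict(groups)
```

Calling `summarize(open("riva.log").read())` (the file name is an assumption) returns a dictionary keyed by category. In the log below, the `KIND_GPU but no GPUs are available` failures line up with the earlier `CUDA driver version is insufficient for CUDA runtime version` warning, while the `riva-trt-*` failures all cite the missing `GLIBCXX_3.4.29` symbol version.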

Triton server is ready...
W0324 18:13:59.715973 19 pinned_memory_manager.cc:236] Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version
I0324 18:13:59.716351 19 cuda_memory_manager.cc:115] CUDA memory pool disabled
I0324 18:13:59.724976 19 model_lifecycle.cc:459] loading: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming:1
I0324 18:13:59.725089 19 model_lifecycle.cc:459] loading: conformer-en-US-asr-streaming-endpointing-streaming:1
I0324 18:13:59.725161 19 model_lifecycle.cc:459] loading: conformer-en-US-asr-streaming-feature-extractor-streaming:1
I0324 18:13:59.725299 19 model_lifecycle.cc:459] loading: intent_slot_detokenizer:1
I0324 18:13:59.725239 21 riva_server.cc:126] Using Insecure Server Credentials

I0324 18:13:59.725618 19 model_lifecycle.cc:459] loading: intent_slot_label_tokens_misty:1 I0324 18:13:59.726326 19 model_lifecycle.cc:459] loading: intent_slot_tokenizer-en-US-misty:1 I0324 18:13:59.726721 19 model_lifecycle.cc:459] loading: riva-onnx-fastpitch_encoder-English-US:1 E0324 18:13:59.729030 21 model_registry.cc:286] error: unable to get server status: failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8001: Failed to connect to remote host: Connection refused I0324 18:13:59.732746 19 model_lifecycle.cc:459] loading: riva-punctuation-en-US:1 I0324 18:13:59.734263 19 model_lifecycle.cc:459] loading: riva-trt-conformer-en-US-asr-streaming-am-streaming:1 I0324 18:13:59.734370 19 model_lifecycle.cc:459] loading: riva-trt-hifigan-English-US:1 I0324 18:13:59.734503 19 model_lifecycle.cc:459] loading: riva-trt-riva-punctuation-en-US-nn-bert-base-uncased:1 I0324 18:13:59.734635 19 model_lifecycle.cc:459] loading: riva-trt-riva_intent_misty-nn-bert-base-uncased:1 I0324 18:13:59.734767 19 model_lifecycle.cc:459] loading: spectrogram_chunker-English-US:1 I0324 18:13:59.734963 19 model_lifecycle.cc:459] loading: tts_postprocessor-English-US:1 I0324 18:13:59.735173 19 model_lifecycle.cc:459] loading: tts_preprocessor-English-US:1 I0324 18:14:00.389257 19 endpointing_library.cc:20] TRITONBACKEND_ModelInitialize: conformer-en-US-asr-streaming-endpointing-streaming (version 1) E0324 18:14:00.389530 19 model_lifecycle.cc:596] failed to load 'conformer-en-US-asr-streaming-feature-extractor-streaming' version 1: Invalid argument: instance group conformer-en-US-asr-streaming-feature-extractor-streaming_0 of model conformer-en-US-asr-streaming-feature-extractor-streaming has kind KIND_GPU but no GPUs are available WARNING: Logging before InitGoogleLogging() is written to STDERR W0324 18:14:00.390765 24 parameter_parser.cc:146] Parameter 'chunk_size' set but unused. 
W0324 18:14:00.390797 24 parameter_parser.cc:146] Parameter 'ms_per_timestep' set but unused. W0324 18:14:00.390801 24 parameter_parser.cc:146] Parameter 'residue_blanks_at_end' set but unused. W0324 18:14:00.390805 24 parameter_parser.cc:146] Parameter 'residue_blanks_at_start' set but unused. W0324 18:14:00.390918 24 parameter_parser.cc:146] Parameter 'start_history' set but unused. W0324 18:14:00.390960 24 parameter_parser.cc:146] Parameter 'start_th' set but unused. W0324 18:14:00.390986 24 parameter_parser.cc:146] Parameter 'stop_history' set but unused. W0324 18:14:00.391013 24 parameter_parser.cc:146] Parameter 'stop_th' set but unused. W0324 18:14:00.391048 24 parameter_parser.cc:146] Parameter 'streaming' set but unused. W0324 18:14:00.391079 24 parameter_parser.cc:146] Parameter 'use_subword' set but unused. W0324 18:14:00.391106 24 parameter_parser.cc:146] Parameter 'vocab_file' set but unused. I0324 18:14:00.392192 19 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-endpointing-streaming", "platform": "", "backend": "riva_asr_endpointing", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEGMENTS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { 
"kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-endpointing-streaming_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "streaming": { "string_value": "True" }, "residue_blanks_at_start": { "string_value": "-2" }, "stop_th": { "string_value": "0.98" }, "start_th": { "string_value": "0.2" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-endpointing-streaming/1/riva_decoder_vocabulary.txt" }, "ms_per_timestep": { "string_value": "40" }, "endpointing_type": { "string_value": "greedy_ctc" }, "stop_history": { "string_value": "800" }, "residue_blanks_at_end": { "string_value": "0" }, "use_subword": { "string_value": "True" }, "start_history": { "string_value": "200" }, "chunk_size": { "string_value": "0.16" } }, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I0324 18:14:00.392402 19 endpointing_library.cc:24] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-endpointing-streaming_0 (device 0) I0324 18:14:00.394024 19 detokenizer_cbe.cc:145] TRITONBACKEND_ModelInitialize: intent_slot_detokenizer (version 1) I0324 18:14:00.394920 19 backend_model.cc:303] model configuration: { "name": "intent_slot_detokenizer", "platform": "", "backend": "riva_nlp_detokenizer", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1, "input": [ { "name": "IN_TOKEN_LABELS0", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, 
"allow_ragged_batch": false, "optional": false }, { "name": "IN_TOKEN_SCORES__1", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_SEQ_LEN2", "data_type": "TYPE_INT64", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "IN_TOK_STR3", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "OUT_TOKEN_LABELS__0", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "OUT_TOKEN_SCORES1", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "OUT_SEQ_LEN2", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "OUT_TOK_STR3", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "intent_slot_detokenizer_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": {}, "model_warmup": [], "model_transaction_policy": { "decoupled": false } } I0324 18:14:00.395170 19 detokenizer_cbe.cc:147] TRITONBACKEND_ModelInstanceInitialize: intent_slot_detokenizer_0 (device 0) I0324 18:14:00.396184 19 ctc-decoder-library.cc:21] TRITONBACKEND_ModelInitialize: 
conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming (version 1) W0324 18:14:00.396492 19 pinned_memory_manager.cc:133] failed to allocate pinned system memory: no pinned memory pool, falling back to non-pinned system memory I0324 18:14:00.396833 19 model_lifecycle.cc:693] successfully loaded 'conformer-en-US-asr-streaming-endpointing-streaming' version 1 WARNING: Logging before InitGoogleLogging() is written to STDERR I0324 18:14:00.397576 19 model_lifecycle.cc:693] successfully loaded 'intent_slot_detokenizer' version 1 W0324 18:14:00.397578 23 parameter_parser.cc:146] Parameter 'append_space_to_transcripts' set but unused. W0324 18:14:00.398144 23 parameter_parser.cc:146] Parameter 'beam_size' set but unused. W0324 18:14:00.398149 23 parameter_parser.cc:146] Parameter 'beam_size_token' set but unused. W0324 18:14:00.398152 23 parameter_parser.cc:146] Parameter 'beam_threshold' set but unused. W0324 18:14:00.398155 23 parameter_parser.cc:146] Parameter 'blank_token' set but unused. W0324 18:14:00.398159 23 parameter_parser.cc:146] Parameter 'cased' set but unused. W0324 18:14:00.398162 23 parameter_parser.cc:146] Parameter 'decoder_num_worker_threads' set but unused. W0324 18:14:00.398165 23 parameter_parser.cc:146] Parameter 'forerunner_beam_size' set but unused. W0324 18:14:00.398169 23 parameter_parser.cc:146] Parameter 'forerunner_beam_size_token' set but unused. W0324 18:14:00.398171 23 parameter_parser.cc:146] Parameter 'forerunner_beam_threshold' set but unused. W0324 18:14:00.398175 23 parameter_parser.cc:146] Parameter 'forerunner_use_lm' set but unused. W0324 18:14:00.398178 23 parameter_parser.cc:146] Parameter 'language_model_file' set but unused. W0324 18:14:00.398182 23 parameter_parser.cc:146] Parameter 'lexicon_file' set but unused. W0324 18:14:00.398185 23 parameter_parser.cc:146] Parameter 'lm_weight' set but unused. W0324 18:14:00.398190 23 parameter_parser.cc:146] Parameter 'log_add' set but unused. 
W0324 18:14:00.398195 23 parameter_parser.cc:146] Parameter 'max_execution_batch_size' set but unused. W0324 18:14:00.398200 23 parameter_parser.cc:146] Parameter 'max_supported_transcripts' set but unused. W0324 18:14:00.398203 23 parameter_parser.cc:146] Parameter 'num_tokenization' set but unused. W0324 18:14:00.398206 23 parameter_parser.cc:146] Parameter 'profane_words_file' set but unused. W0324 18:14:00.398211 23 parameter_parser.cc:146] Parameter 'return_separate_utterances' set but unused. W0324 18:14:00.398216 23 parameter_parser.cc:146] Parameter 'set_default_index_to_unk_token' set but unused. W0324 18:14:00.398219 23 parameter_parser.cc:146] Parameter 'sil_token' set but unused. W0324 18:14:00.398223 23 parameter_parser.cc:146] Parameter 'smearing_mode' set but unused. W0324 18:14:00.398226 23 parameter_parser.cc:146] Parameter 'tokenizer_model' set but unused. W0324 18:14:00.398231 23 parameter_parser.cc:146] Parameter 'unk_score' set but unused. W0324 18:14:00.398234 23 parameter_parser.cc:146] Parameter 'unk_token' set but unused. W0324 18:14:00.398238 23 parameter_parser.cc:146] Parameter 'use_lexicon_free_decoding' set but unused. W0324 18:14:00.398242 23 parameter_parser.cc:146] Parameter 'vocab_file' set but unused. W0324 18:14:00.398247 23 parameter_parser.cc:146] Parameter 'word_insertion_score' set but unused. 
I0324 18:14:00.399717 19 backend_model.cc:303] model configuration: { "name": "conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming", "platform": "", "backend": "riva_asr_decoder", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1024, "input": [ { "name": "CLASS_LOGITS", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 257 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "END_FLAG", "data_type": "TYPE_UINT32", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "SEGMENTS_START_END", "data_type": "TYPE_INT32", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false }, { "name": "CUSTOM_CONFIGURATION", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ -1, 2 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "FINAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1, -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_TRANSCRIPTS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_TRANSCRIPTS_STABILITY", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_START_END", "data_type": "TYPE_INT32", "dims": [ -1, 2 ], "label_filename": "", "is_shape_tensor": false }, { "name": "FINAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "PARTIAL_WORDS_SCORE", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", 
"is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "graph": { "level": 0 }, "priority": "PRIORITY_DEFAULT", "cuda": { "graphs": false, "busy_wait_events": false, "graph_spec": [], "output_copy_stream": true }, "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "sequence_batching": { "oldest": { "max_candidate_sequences": 1024, "preferred_batch_size": [ 32, 64 ], "max_queue_delay_microseconds": 1000 }, "max_sequence_idle_microseconds": 60000000, "control_input": [ { "name": "START", "control": [ { "kind": "CONTROL_SEQUENCE_START", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "READY", "control": [ { "kind": "CONTROL_SEQUENCE_READY", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "END", "control": [ { "kind": "CONTROL_SEQUENCE_END", "int32_false_true": [ 0, 1 ], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_INVALID" } ] }, { "name": "CORRID", "control": [ { "kind": "CONTROL_SEQUENCE_CORRID", "int32_false_true": [], "fp32_false_true": [], "bool_false_true": [], "data_type": "TYPE_UINT64" } ] } ], "state": [] }, "instance_group": [ { "name": "conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "beam_size_token": { "string_value": "16" }, "use_lexicon_free_decoding": { "string_value": "False" }, "forerunner_beam_size": { "string_value": "8" }, "log_add": { "string_value": "True" }, "force_decoder_reset_after_ms": { "string_value": "-1" }, "num_tokenization": { "string_value": "1" }, "language_model_file": { "string_value": 
"/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/4gram-pruned-0_2_7_9-en-lm-set-1.0.bin" }, "forerunner_beam_threshold": { "string_value": "10.0" }, "return_separate_utterances": { "string_value": "False" }, "set_default_index_to_unk_token": { "string_value": "False" }, "word_insertion_score": { "string_value": "1.0" }, "use_subword": { "string_value": "True" }, "tokenizer_model": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/8b8f095152034e98b24ab33726708bd0_tokenizer.model" }, "max_execution_batch_size": { "string_value": "1" }, "unk_token": { "string_value": "" }, "right_padding_size": { "string_value": "1.92" }, "beam_size": { "string_value": "32" }, "forerunner_beam_size_token": { "string_value": "8" }, "profane_words_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/profane_words_file.txt" }, "lexicon_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/lexicon.txt" }, "append_space_to_transcripts": { "string_value": "True" }, "vocab_file": { "string_value": "/data/models/conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming/1/riva_decoder_vocabulary.txt" }, "smearing_mode": { "string_value": "max" }, "decoder_type": { "string_value": "flashlight" }, "unk_score": { "string_value": "-inf" }, "lm_weight": { "string_value": "0.8" }, "asr_model_delay": { "string_value": "-1" }, "beam_threshold": { "string_value": "20.0" }, "blank_token": { "string_value": "#" }, "ms_per_timestep": { "string_value": "40" }, "max_supported_transcripts": { "string_value": "1" }, "decoder_num_worker_threads": { "string_value": "-1" }, "left_padding_size": { "string_value": "1.92" }, "cased": { "string_value": "False" }, "streaming": { "string_value": "True" }, "chunk_size": { "string_value": "0.16" }, "sil_token": { "string_value": "▁" }, "forerunner_use_lm": { "string_value": "true" } }, "model_warmup": [], 
"model_transaction_policy": { "decoupled": false } } I0324 18:14:00.409563 19 sequence_label_cbe.cc:137] TRITONBACKEND_ModelInitialize: intent_slot_label_tokens_misty (version 1) I0324 18:14:00.410388 19 backend_model.cc:303] model configuration: { "name": "intent_slot_label_tokens_misty", "platform": "", "backend": "riva_nlp_seqlabel", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1, "input": [ { "name": "TOKEN_LOGIT1", "data_type": "TYPE_FP32", "format": "FORMAT_NONE", "dims": [ -1, 65 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "TOKEN_LABELS0", "data_type": "TYPE_STRING", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOKEN_SCORES1", "data_type": "TYPE_FP32", "dims": [ -1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "intent_slot_label_tokens_misty_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "classes": { "string_value": "/data/models/intent_slot_label_tokens_misty/1/slot_labels.csv" } }, "model_warmup": [] } I0324 18:14:00.410581 19 sequence_label_cbe.cc:139] TRITONBACKEND_ModelInstanceInitialize: intent_slot_label_tokens_misty_0 (device 0) I0324 18:14:00.410847 19 tokenizer_library.cc:20] TRITONBACKEND_ModelInitialize: intent_slot_tokenizer-en-US-misty (version 1) WARNING: Logging before InitGoogleLogging() is written to STDERR W0324 18:14:00.411520 28 parameter_parser.cc:146] Parameter 'unk_token' set but unused. W0324 18:14:00.411599 28 parameter_parser.cc:146] Parameter 'vocab' set but unused. 
I0324 18:14:00.411709 19 backend_model.cc:303] model configuration: { "name": "intent_slot_tokenizer-en-US-misty", "platform": "", "backend": "riva_nlp_tokenizer", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1, "input": [ { "name": "INPUT_STR__0", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEQ0", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "MASK__1", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEGMENT4", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEQ_LEN2", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOK_STR__3", "data_type": "TYPE_STRING", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "intent_slot_tokenizer-en-US-misty_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "task": { "string_value": "single_input" }, "tokenizer": { "string_value": "wordpiece" }, "eos_token": { "string_value": "[SEP]" }, "bos_token": { "string_value": "[CLS]" }, "pad_chars_with_spaces": { "string_value": "False" }, "vocab": { "string_value": "/data/models/intent_slot_tokenizer-en-US-misty/1/tokenizer.vocab_file" }, "to_lower": { "string_value": "true" }, "unk_token": { "string_value": "[UNK]" } }, "model_warmup": [] } I0324 18:14:00.411825 19 
ctc-decoder-library.cc:25] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming_0 (device 0) I0324 18:14:00.412924 19 model_lifecycle.cc:693] successfully loaded 'intent_slot_label_tokens_misty' version 1 I0324 18:14:03.905520 23 ctc-decoder.cc:179] Beam Decoder initialized successfully! I0324 18:14:03.908461 19 onnxruntime.cc:2459] TRITONBACKEND_Initialize: onnxruntime I0324 18:14:03.908613 19 onnxruntime.cc:2469] Triton TRITONBACKEND API version: 1.10 I0324 18:14:03.908682 19 onnxruntime.cc:2475] 'onnxruntime' TRITONBACKEND API version: 1.10 I0324 18:14:03.908747 19 onnxruntime.cc:2505] backend configuration: {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} I0324 18:14:03.924653 19 model_lifecycle.cc:693] successfully loaded 'conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1 I0324 18:14:03.926894 19 tokenizer_library.cc:23] TRITONBACKEND_ModelInstanceInitialize: intent_slot_tokenizer-en-US-misty_0 (device 0) E0324 18:14:03.930276 19 model_lifecycle.cc:596] failed to load 'riva-onnx-fastpitch_encoder-English-US' version 1: Invalid argument: instance group riva-onnx-fastpitch_encoder-English-US_0 of model riva-onnx-fastpitch_encoder-English-US has kind KIND_GPU but no GPUs are available I0324 18:14:03.946244 19 model_lifecycle.cc:693] successfully loaded 'intent_slot_tokenizer-en-US-misty' version 1 I0324 18:14:03.960481 19 pipeline_library.cc:24] TRITONBACKEND_ModelInitialize: riva-punctuation-en-US (version 1) E0324 18:14:03.960477 19 model_lifecycle.cc:596] failed to load 'riva-trt-conformer-en-US-asr-streaming-am-streaming' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) WARNING: Logging before InitGoogleLogging() is written to STDERR W0324 
18:14:03.961179 30 parameter_parser.cc:146] Parameter 'attn_mask_tensor_name' set but unused. W0324 18:14:03.961200 30 parameter_parser.cc:146] Parameter 'bos_token' set but unused. W0324 18:14:03.961205 30 parameter_parser.cc:146] Parameter 'capit_logits_tensor_name' set but unused. W0324 18:14:03.961208 30 parameter_parser.cc:146] Parameter 'capitalization_mapping_path' set but unused. W0324 18:14:03.961211 30 parameter_parser.cc:146] Parameter 'delimiter' set but unused. W0324 18:14:03.961215 30 parameter_parser.cc:146] Parameter 'eos_token' set but unused. W0324 18:14:03.961217 30 parameter_parser.cc:146] Parameter 'input_ids_tensor_name' set but unused. W0324 18:14:03.961220 30 parameter_parser.cc:146] Parameter 'language_code' set but unused. W0324 18:14:03.961223 30 parameter_parser.cc:146] Parameter 'load_model' set but unused. W0324 18:14:03.961226 30 parameter_parser.cc:146] Parameter 'model_api' set but unused. W0324 18:14:03.961230 30 parameter_parser.cc:146] Parameter 'model_family' set but unused. W0324 18:14:03.961232 30 parameter_parser.cc:146] Parameter 'model_name' set but unused. W0324 18:14:03.961236 30 parameter_parser.cc:146] Parameter 'pad_chars_with_spaces' set but unused. W0324 18:14:03.961238 30 parameter_parser.cc:146] Parameter 'pipeline_type' set but unused. W0324 18:14:03.961241 30 parameter_parser.cc:146] Parameter 'preserve_accents' set but unused. W0324 18:14:03.961244 30 parameter_parser.cc:146] Parameter 'punct_logits_tensor_name' set but unused. W0324 18:14:03.961248 30 parameter_parser.cc:146] Parameter 'punctuation_mapping_path' set but unused. W0324 18:14:03.961251 30 parameter_parser.cc:146] Parameter 'remove_spaces' set but unused. W0324 18:14:03.961254 30 parameter_parser.cc:146] Parameter 'to_lower' set but unused. W0324 18:14:03.961257 30 parameter_parser.cc:146] Parameter 'token_type_tensor_name' set but unused. W0324 18:14:03.961261 30 parameter_parser.cc:146] Parameter 'tokenizer' set but unused. 
W0324 18:14:03.961263 30 parameter_parser.cc:146] Parameter 'tokenizer_to_lower' set but unused. W0324 18:14:03.961266 30 parameter_parser.cc:146] Parameter 'unicode_normalize' set but unused. W0324 18:14:03.961269 30 parameter_parser.cc:146] Parameter 'unk_token' set but unused. W0324 18:14:03.961272 30 parameter_parser.cc:146] Parameter 'use_int64_nn_inputs' set but unused. W0324 18:14:03.961275 30 parameter_parser.cc:146] Parameter 'vocab' set but unused. W0324 18:14:03.961333 30 parameter_parser.cc:146] Parameter 'attn_mask_tensor_name' set but unused. W0324 18:14:03.961341 30 parameter_parser.cc:146] Parameter 'bos_token' set but unused. W0324 18:14:03.961345 30 parameter_parser.cc:146] Parameter 'capit_logits_tensor_name' set but unused. W0324 18:14:03.961347 30 parameter_parser.cc:146] Parameter 'capitalization_mapping_path' set but unused. W0324 18:14:03.961350 30 parameter_parser.cc:146] Parameter 'delimiter' set but unused. W0324 18:14:03.961354 30 parameter_parser.cc:146] Parameter 'eos_token' set but unused. W0324 18:14:03.961356 30 parameter_parser.cc:146] Parameter 'input_ids_tensor_name' set but unused. W0324 18:14:03.961359 30 parameter_parser.cc:146] Parameter 'language_code' set but unused. W0324 18:14:03.961364 30 parameter_parser.cc:146] Parameter 'model_api' set but unused. W0324 18:14:03.961365 30 parameter_parser.cc:146] Parameter 'model_family' set but unused. W0324 18:14:03.961369 30 parameter_parser.cc:146] Parameter 'model_name' set but unused. W0324 18:14:03.961371 30 parameter_parser.cc:146] Parameter 'pad_chars_with_spaces' set but unused. W0324 18:14:03.961374 30 parameter_parser.cc:146] Parameter 'preserve_accents' set but unused. W0324 18:14:03.961377 30 parameter_parser.cc:146] Parameter 'punct_logits_tensor_name' set but unused. W0324 18:14:03.961380 30 parameter_parser.cc:146] Parameter 'punctuation_mapping_path' set but unused. W0324 18:14:03.961383 30 parameter_parser.cc:146] Parameter 'remove_spaces' set but unused. 
W0324 18:14:03.961386 30 parameter_parser.cc:146] Parameter 'to_lower' set but unused. W0324 18:14:03.961390 30 parameter_parser.cc:146] Parameter 'token_type_tensor_name' set but unused. W0324 18:14:03.961392 30 parameter_parser.cc:146] Parameter 'tokenizer' set but unused. W0324 18:14:03.961395 30 parameter_parser.cc:146] Parameter 'tokenizer_to_lower' set but unused. W0324 18:14:03.961398 30 parameter_parser.cc:146] Parameter 'unicode_normalize' set but unused. W0324 18:14:03.961401 30 parameter_parser.cc:146] Parameter 'unk_token' set but unused. W0324 18:14:03.961405 30 parameter_parser.cc:146] Parameter 'use_int64_nn_inputs' set but unused. W0324 18:14:03.961407 30 parameter_parser.cc:146] Parameter 'vocab' set but unused. W0324 18:14:03.961431 30 parameter_parser.cc:146] Parameter 'attn_mask_tensor_name' set but unused. W0324 18:14:03.961436 30 parameter_parser.cc:146] Parameter 'bos_token' set but unused. W0324 18:14:03.961441 30 parameter_parser.cc:146] Parameter 'capit_logits_tensor_name' set but unused. W0324 18:14:03.961442 30 parameter_parser.cc:146] Parameter 'capitalization_mapping_path' set but unused. W0324 18:14:03.961445 30 parameter_parser.cc:146] Parameter 'delimiter' set but unused. W0324 18:14:03.961448 30 parameter_parser.cc:146] Parameter 'eos_token' set but unused. W0324 18:14:03.961452 30 parameter_parser.cc:146] Parameter 'input_ids_tensor_name' set but unused. W0324 18:14:03.961454 30 parameter_parser.cc:146] Parameter 'language_code' set but unused. W0324 18:14:03.961457 30 parameter_parser.cc:146] Parameter 'model_api' set but unused. W0324 18:14:03.961460 30 parameter_parser.cc:146] Parameter 'model_family' set but unused. W0324 18:14:03.961463 30 parameter_parser.cc:146] Parameter 'model_name' set but unused. W0324 18:14:03.961467 30 parameter_parser.cc:146] Parameter 'pad_chars_with_spaces' set but unused. W0324 18:14:03.961469 30 parameter_parser.cc:146] Parameter 'preserve_accents' set but unused. 
W0324 18:14:03.961473 30 parameter_parser.cc:146] Parameter 'punct_logits_tensor_name' set but unused. W0324 18:14:03.961477 30 parameter_parser.cc:146] Parameter 'punctuation_mapping_path' set but unused. W0324 18:14:03.961479 30 parameter_parser.cc:146] Parameter 'remove_spaces' set but unused. W0324 18:14:03.961483 30 parameter_parser.cc:146] Parameter 'to_lower' set but unused. W0324 18:14:03.961485 30 parameter_parser.cc:146] Parameter 'token_type_tensor_name' set but unused. W0324 18:14:03.961488 30 parameter_parser.cc:146] Parameter 'tokenizer_to_lower' set but unused. W0324 18:14:03.961491 30 parameter_parser.cc:146] Parameter 'unicode_normalize' set but unused. W0324 18:14:03.961494 30 parameter_parser.cc:146] Parameter 'unk_token' set but unused. W0324 18:14:03.961498 30 parameter_parser.cc:146] Parameter 'use_int64_nn_inputs' set but unused. W0324 18:14:03.961500 30 parameter_parser.cc:146] Parameter 'vocab' set but unused. W0324 18:14:03.961565 30 parameter_parser.cc:146] Parameter 'model_api' set but unused. W0324 18:14:03.961572 30 parameter_parser.cc:146] Parameter 'model_family' set but unused. 
I0324 18:14:03.961635 19 backend_model.cc:303] model configuration: { "name": "riva-punctuation-en-US", "platform": "", "backend": "riva_nlp_pipeline", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1, "input": [ { "name": "PIPELINE_INPUT", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "PIPELINE_OUTPUT", "data_type": "TYPE_STRING", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "riva-punctuation-en-US_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "token_type_tensor_name": { "string_value": "token_type_ids" }, "to_lower": { "string_value": "true" }, "tokenizer_to_lower": { "string_value": "true" }, "model_name": { "string_value": "riva-trt-riva-punctuation-en-US-nn-bert-base-uncased" }, "attn_mask_tensor_name": { "string_value": "attention_mask" }, "model_api": { "string_value": "/nvidia.riva.nlp.RivaLanguageUnderstanding/PunctuateText" }, "load_model": { "string_value": "false" }, "pipeline_type": { "string_value": "punctuation" }, "model_family": { "string_value": "riva" }, "pad_chars_with_spaces": { "string_value": "False" }, "capit_logits_tensor_name": { "string_value": "capit_logits" }, "vocab": { "string_value": "/data/models/riva-punctuation-en-US/1/f92889b136d2433693cb9127e1aea218_vocab.txt" }, "remove_spaces": { "string_value": "False" }, "language_code": { "string_value": "en-US" }, "eos_token": { "string_value": "[SEP]" }, "punctuation_mapping_path": { "string_value": "/data/models/riva-punctuation-en-US/1/bf74918539724a61a0d7703134519ea5_punct_label_ids.csv" }, "unk_token": { "string_value": "[UNK]" }, "delimiter": { "string_value": " " }, "capitalization_mapping_path": { "string_value": "/data/models/riva-punctuation-en-US/1/56633d0a0d8e459b9c8acd572cfa34b8_capit_label_ids.csv" }, "input_ids_tensor_name": { "string_value": "input_ids" }, "preserve_accents": { "string_value": "false" }, "tokenizer": { "string_value": "wordpiece" }, "unicode_normalize": { "string_value": "False" }, "punct_logits_tensor_name": { "string_value": "punct_logits" }, "use_int64_nn_inputs": { "string_value": "False" }, "bos_token": { "string_value": "[CLS]" } }, "model_warmup": [] }
E0324 18:14:03.963356 19 model_lifecycle.cc:596] failed to load 'riva-trt-hifigan-English-US' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)
E0324 18:14:03.964964 19 model_lifecycle.cc:596] failed to load 'riva-trt-riva-punctuation-en-US-nn-bert-base-uncased' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)
E0324 18:14:03.966507 19 model_lifecycle.cc:596] failed to load 'riva-trt-riva_intent_misty-nn-bert-base-uncased' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)
E0324 18:14:03.967139 19 model_lifecycle.cc:596] failed to load 'spectrogram_chunker-English-US' version 1: Invalid argument: instance group spectrogram_chunker-English-US_0 of model spectrogram_chunker-English-US has kind KIND_GPU but no GPUs are available
E0324 18:14:03.969340 19 model_lifecycle.cc:596] failed to load 'tts_postprocessor-English-US' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)
E0324 18:14:03.975181 19 model_lifecycle.cc:596] failed to load 'tts_preprocessor-English-US' version 1: Invalid argument: instance group tts_preprocessor-English-US_0 of model tts_preprocessor-English-US has kind KIND_GPU but no GPUs are available
I0324 18:14:03.975322 19 pipeline_library.cc:28] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-US_0 (device 0)
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaHostRegister( pinned_host_punctbuffer.data(), pinned_host_punctbuffer.size() * sizeof(float), 0)' in file `riva/nlp/pipeline/punctuator/punctuator.cc line 158'
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaHostRegister( pinned_host_capitbuffer.data(), pinned_host_capitbuffer.size() * sizeof(float), 0)' in file `riva/nlp/pipeline/punctuator/punctuator.cc line 160'
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaSetDevice(device_id)' in file `./riva/pipeline/pipeline.h line 52'
I0324 18:14:04.004839 19 model_lifecycle.cc:693] successfully loaded 'riva-punctuation-en-US' version 1
E0324 18:14:04.005023 19 model_repository_manager.cc:481] Invalid argument: ensemble 'conformer-en-US-asr-streaming' depends on 'riva-trt-conformer-en-US-asr-streaming-am-streaming' which has no loaded version
E0324 18:14:04.005040 19 model_repository_manager.cc:481] Invalid argument: ensemble 'fastpitch_hifigan_ensemble-English-US' depends on 'tts_postprocessor-English-US' which has no loaded version
E0324 18:14:04.005047 19 model_repository_manager.cc:481] Invalid argument: ensemble 'riva_intent_misty' depends on 'riva-trt-riva_intent_misty-nn-bert-base-uncased' which has no loaded version
I0324 18:14:04.005108 19 server.cc:563]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0324 18:14:04.005307 19 server.cc:590]
+-----------------------+-------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend               | Path                                                                                | Config                                                                                                                                                         |
+-----------------------+-------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| riva_asr_decoder      | /opt/tritonserver/backends/riva_asr_decoder/libtriton_riva_asr_decoder.so           | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_nlp_detokenizer  | /opt/tritonserver/backends/riva_nlp_detokenizer/libtriton_riva_nlp_detokenizer.so   | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_asr_endpointing  | /opt/tritonserver/backends/riva_asr_endpointing/libtriton_riva_asr_endpointing.so   | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| onnxruntime           | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so                     | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_nlp_tokenizer    | /opt/tritonserver/backends/riva_nlp_tokenizer/libtriton_riva_nlp_tokenizer.so       | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_asr_features     | /opt/tritonserver/backends/riva_asr_features/libtriton_riva_asr_features.so         | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_tts_preprocessor | /opt/tritonserver/backends/riva_tts_preprocessor/libtriton_riva_tts_preprocessor.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_nlp_pipeline     | /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so         | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_nlp_seqlabel     | /opt/tritonserver/backends/riva_nlp_seqlabel/libtriton_riva_nlp_seqlabel.so         | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_tts_chunker      | /opt/tritonserver/backends/riva_tts_chunker/libtriton_riva_tts_chunker.so           | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+-----------------------+-------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0324 18:14:04.005534 19 server.cc:633]
+-----------------------------------------------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model                                                     | Version | Status                                                                                                                                                                                                                   |
+-----------------------------------------------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming   | 1       | READY                                                                                                                                                                                                                    |
| conformer-en-US-asr-streaming-endpointing-streaming       | 1       | READY                                                                                                                                                                                                                    |
| conformer-en-US-asr-streaming-feature-extractor-streaming | 1       | UNAVAILABLE: Invalid argument: instance group conformer-en-US-asr-streaming-feature-extractor-streaming_0 of model conformer-en-US-asr-streaming-feature-extractor-streaming has kind KIND_GPU but no GPUs are available |
| intent_slot_detokenizer                                   | 1       | READY                                                                                                                                                                                                                    |
| intent_slot_label_tokens_misty                            | 1       | READY                                                                                                                                                                                                                    |
| intent_slot_tokenizer-en-US-misty                         | 1       | READY                                                                                                                                                                                                                    |
| riva-onnx-fastpitch_encoder-English-US                    | 1       | UNAVAILABLE: Invalid argument: instance group riva-onnx-fastpitch_encoder-English-US_0 of model riva-onnx-fastpitch_encoder-English-US has kind KIND_GPU but no GPUs are available                                       |
| riva-punctuation-en-US                                    | 1       | READY                                                                                                                                                                                                                    |
| riva-trt-conformer-en-US-asr-streaming-am-streaming       | 1       | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)                    |
| riva-trt-hifigan-English-US                               | 1       | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)                    |
| riva-trt-riva-punctuation-en-US-nn-bert-base-uncased      | 1       | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)                    |
| riva-trt-riva_intent_misty-nn-bert-base-uncased           | 1       | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)                    |
| spectrogram_chunker-English-US                            | 1       | UNAVAILABLE: Invalid argument: instance group spectrogram_chunker-English-US_0 of model spectrogram_chunker-English-US has kind KIND_GPU but no GPUs are available                                                       |
| tts_postprocessor-English-US                              | 1       | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)                    |
| tts_preprocessor-English-US                               | 1       | UNAVAILABLE: Invalid argument: instance group tts_preprocessor-English-US_0 of model tts_preprocessor-English-US has kind KIND_GPU but no GPUs are available                                                             |
+-----------------------------------------------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

W0324 18:14:04.005631 19 metrics.cc:354] No polling metrics (CPU, GPU, Cache) are enabled. Will not poll for them.
I0324 18:14:04.005809 19 tritonserver.cc:2264]
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                                |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                               |
| server_version                   | 2.27.0                                                                                                                                                                                               |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |
| model_repository_path[0]         | /data/models                                                                                                                                                                                         |
| model_control_mode               | MODE_NONE                                                                                                                                                                                            |
| strict_model_config              | 1                                                                                                                                                                                                    |
| rate_limit                       | OFF                                                                                                                                                                                                  |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                            |
| cuda_memory_pool_byte_size{0}    | 1000000000                                                                                                                                                                                           |
| response_cache_byte_size         | 0                                                                                                                                                                                                    |
| min_supported_compute_capability | 5.3                                                                                                                                                                                                  |
| strict_readiness                 | 1                                                                                                                                                                                                    |
| exit_timeout                     | 30                                                                                                                                                                                                   |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0324 18:14:04.005879 19 server.cc:264] Waiting for in-flight requests to complete.
I0324 18:14:04.005953 19 server.cc:280] Timeout 30: Found 0 model versions that have in-flight inferences
I0324 18:14:04.006321 19 server.cc:295] All models are stopped, unloading models
I0324 18:14:04.006402 19 server.cc:302] Timeout 30: Found 6 live models and 0 in-flight non-inference requests
I0324 18:14:04.008310 19 pipeline_library.cc:31] TRITONBACKEND_ModelInstanceFinalize: delete instance state
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaHostUnregister(pinned_host_punctbuffer.data())' in file `riva/nlp/pipeline/punctuator/punctuator.cc line 167'
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaHostUnregister(pinned_host_capitbuffer.data())' in file `riva/nlp/pipeline/punctuator/punctuator.cc line 168'
I0324 18:14:04.008627 19 sequence_label_cbe.cc:142] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.008697 19 sequence_label_cbe.cc:138] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.008732 19 ctc-decoder-library.cc:27] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.008832 19 detokenizer_cbe.cc:150] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.008845 19 tokenizer_library.cc:27] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.008906 19 detokenizer_cbe.cc:146] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.008666 19 endpointing_library.cc:28] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.009107 19 model_lifecycle.cc:578] successfully unloaded 'intent_slot_detokenizer' version 1
I0324 18:14:04.013533 19 pipeline_library.cc:27] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.014170 19 endpointing_library.cc:23] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.014555 19 tokenizer_library.cc:22] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.008742 19 model_lifecycle.cc:578] successfully unloaded 'intent_slot_label_tokens_misty' version 1
I0324 18:14:04.014816 19 model_lifecycle.cc:578] successfully unloaded 'conformer-en-US-asr-streaming-endpointing-streaming' version 1
I0324 18:14:04.015449 19 model_lifecycle.cc:578] successfully unloaded 'riva-punctuation-en-US' version 1
I0324 18:14:04.022604 19 model_lifecycle.cc:578] successfully unloaded 'intent_slot_tokenizer-en-US-misty' version 1
I0324 18:14:04.945356 19 ctc-decoder-library.cc:24] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.946123 19 model_lifecycle.cc:578] successfully unloaded 'conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1
I0324 18:14:05.006539 19 server.cc:302] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
E0324 18:14:09.729276 21 model_registry.cc:286] error: unable to get server status: failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8001: Failed to connect to remote host: Connection refused
One of the processes has exited unexpectedly. Stopping container.
W0324 18:14:09.747316 21 riva_server.cc:196] Signal: 15
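
For anyone reading a log like this one, the load failures boil down to two messages: a missing `GLIBCXX_3.4.29` symbol version in the container's libstdc++, and `KIND_GPU but no GPUs are available`. Here is a minimal Python sketch for grouping them, assuming the log has one entry per line. The helper names and the cause labels are made up for this illustration and are not part of Triton or Riva.

```python
import re

# Matches Triton entries of the form: failed to load '<model>' version N: <reason>
FAILED = re.compile(r"failed to load '([^']+)' version \d+: (.+)")


def classify(reason):
    """Bucket a failure reason into a coarse cause (labels are this sketch's own)."""
    if "GLIBCXX" in reason:
        return "libstdc++ missing required GLIBCXX version"
    if "no GPUs are available" in reason:
        return "no GPU visible to Triton"
    return "other"


def summarize(log_lines):
    """Return {cause: [model, ...]} for every failed-to-load entry."""
    causes = {}
    for line in log_lines:
        match = FAILED.search(line)
        if match:
            causes.setdefault(classify(match.group(2)), []).append(match.group(1))
    return causes


# Three entries copied from the log above (one fails per cause, one loads fine).
sample = [
    "E0324 18:14:03.963356 19 model_lifecycle.cc:596] failed to load 'riva-trt-hifigan-English-US' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)",
    "E0324 18:14:03.967139 19 model_lifecycle.cc:596] failed to load 'spectrogram_chunker-English-US' version 1: Invalid argument: instance group spectrogram_chunker-English-US_0 of model spectrogram_chunker-English-US has kind KIND_GPU but no GPUs are available",
    "I0324 18:14:04.004839 19 model_lifecycle.cc:693] successfully loaded 'riva-punctuation-en-US' version 1",
]

for cause, models in summarize(sample).items():
    print(f"{cause}: {', '.join(models)}")
```

To run it over a saved log, pass an open file object to `summarize` in place of `sample`.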

dusty-nv commented 3 months ago

Sorry, yes: the riva-client is fine on JP6, but the riva-embedded server containers from NGC are not out yet for JP6 (they should be soon). So on JP6 you would run the Riva server somewhere else, or on a JP5 instance. Or use whisper/XTTS for now.
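
If you go the remote-server route, it can help to confirm the server's port is reachable from the JP6 device before pointing riva-client at it. A minimal Python sketch using only the standard library follows. It assumes the server listens on port 50051 (the usual Riva gRPC default, adjust if yours differs); `RIVA_HOST` is a hypothetical environment variable name used only for this example. It checks that the TCP port accepts connections, not that Riva itself is healthy.

```python
import os
import socket


def riva_reachable(host, port=50051, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# RIVA_HOST is a made-up variable name; set it to the machine running the server.
host = os.environ.get("RIVA_HOST", "127.0.0.1")
print(f"{host}:50051 reachable: {riva_reachable(host)}")
```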

I0324 18:14:00.411709 19 backend_model.cc:303] model configuration: { "name": "intent_slot_tokenizer-en-US-misty", "platform": "", "backend": "riva_nlp_tokenizer", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1, "input": [ { "name": "INPUT_STR__0", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "SEQ0", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "MASK__1", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEGMENT4", "data_type": "TYPE_INT32", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false }, { "name": "SEQ_LEN2", "data_type": "TYPE_INT64", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false }, { "name": "TOK_STR__3", "data_type": "TYPE_STRING", "dims": [ 128 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "intent_slot_tokenizer-en-US-misty_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "task": { "string_value": "single_input" }, "tokenizer": { "string_value": "wordpiece" }, "eos_token": { "string_value": "[SEP]" }, "bos_token": { "string_value": "[CLS]" }, "pad_chars_with_spaces": { "string_value": "False" }, "vocab": { "string_value": "/data/models/intent_slot_tokenizer-en-US-misty/1/tokenizer.vocab_file" }, "to_lower": { "string_value": "true" }, "unk_token": { "string_value": "[UNK]" } }, "model_warmup": [] } I0324 18:14:00.411825 19 
ctc-decoder-library.cc:25] TRITONBACKEND_ModelInstanceInitialize: conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming_0 (device 0) I0324 18:14:00.412924 19 model_lifecycle.cc:693] successfully loaded 'intent_slot_label_tokens_misty' version 1 I0324 18:14:03.905520 23 ctc-decoder.cc:179] Beam Decoder initialized successfully! I0324 18:14:03.908461 19 onnxruntime.cc:2459] TRITONBACKEND_Initialize: onnxruntime I0324 18:14:03.908613 19 onnxruntime.cc:2469] Triton TRITONBACKEND API version: 1.10 I0324 18:14:03.908682 19 onnxruntime.cc:2475] 'onnxruntime' TRITONBACKEND API version: 1.10 I0324 18:14:03.908747 19 onnxruntime.cc:2505] backend configuration: {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} I0324 18:14:03.924653 19 model_lifecycle.cc:693] successfully loaded 'conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1 I0324 18:14:03.926894 19 tokenizer_library.cc:23] TRITONBACKEND_ModelInstanceInitialize: intent_slot_tokenizer-en-US-misty_0 (device 0) E0324 18:14:03.930276 19 model_lifecycle.cc:596] failed to load 'riva-onnx-fastpitch_encoder-English-US' version 1: Invalid argument: instance group riva-onnx-fastpitch_encoder-English-US_0 of model riva-onnx-fastpitch_encoder-English-US has kind KIND_GPU but no GPUs are available I0324 18:14:03.946244 19 model_lifecycle.cc:693] successfully loaded 'intent_slot_tokenizer-en-US-misty' version 1 I0324 18:14:03.960481 19 pipeline_library.cc:24] TRITONBACKEND_ModelInitialize: riva-punctuation-en-US (version 1) E0324 18:14:03.960477 19 model_lifecycle.cc:596] failed to load 'riva-trt-conformer-en-US-asr-streaming-am-streaming' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) WARNING: Logging before InitGoogleLogging() is written to STDERR W0324 
18:14:03.961179 30 parameter_parser.cc:146] Parameter 'attn_mask_tensor_name' set but unused. W0324 18:14:03.961200 30 parameter_parser.cc:146] Parameter 'bos_token' set but unused. W0324 18:14:03.961205 30 parameter_parser.cc:146] Parameter 'capit_logits_tensor_name' set but unused. W0324 18:14:03.961208 30 parameter_parser.cc:146] Parameter 'capitalization_mapping_path' set but unused. W0324 18:14:03.961211 30 parameter_parser.cc:146] Parameter 'delimiter' set but unused. W0324 18:14:03.961215 30 parameter_parser.cc:146] Parameter 'eos_token' set but unused. W0324 18:14:03.961217 30 parameter_parser.cc:146] Parameter 'input_ids_tensor_name' set but unused. W0324 18:14:03.961220 30 parameter_parser.cc:146] Parameter 'language_code' set but unused. W0324 18:14:03.961223 30 parameter_parser.cc:146] Parameter 'load_model' set but unused. W0324 18:14:03.961226 30 parameter_parser.cc:146] Parameter 'model_api' set but unused. W0324 18:14:03.961230 30 parameter_parser.cc:146] Parameter 'model_family' set but unused. W0324 18:14:03.961232 30 parameter_parser.cc:146] Parameter 'model_name' set but unused. W0324 18:14:03.961236 30 parameter_parser.cc:146] Parameter 'pad_chars_with_spaces' set but unused. W0324 18:14:03.961238 30 parameter_parser.cc:146] Parameter 'pipeline_type' set but unused. W0324 18:14:03.961241 30 parameter_parser.cc:146] Parameter 'preserve_accents' set but unused. W0324 18:14:03.961244 30 parameter_parser.cc:146] Parameter 'punct_logits_tensor_name' set but unused. W0324 18:14:03.961248 30 parameter_parser.cc:146] Parameter 'punctuation_mapping_path' set but unused. W0324 18:14:03.961251 30 parameter_parser.cc:146] Parameter 'remove_spaces' set but unused. W0324 18:14:03.961254 30 parameter_parser.cc:146] Parameter 'to_lower' set but unused. W0324 18:14:03.961257 30 parameter_parser.cc:146] Parameter 'token_type_tensor_name' set but unused. W0324 18:14:03.961261 30 parameter_parser.cc:146] Parameter 'tokenizer' set but unused. 
W0324 18:14:03.961263 30 parameter_parser.cc:146] Parameter 'tokenizer_to_lower' set but unused. W0324 18:14:03.961266 30 parameter_parser.cc:146] Parameter 'unicode_normalize' set but unused. W0324 18:14:03.961269 30 parameter_parser.cc:146] Parameter 'unk_token' set but unused. W0324 18:14:03.961272 30 parameter_parser.cc:146] Parameter 'use_int64_nn_inputs' set but unused. W0324 18:14:03.961275 30 parameter_parser.cc:146] Parameter 'vocab' set but unused. W0324 18:14:03.961333 30 parameter_parser.cc:146] Parameter 'attn_mask_tensor_name' set but unused. W0324 18:14:03.961341 30 parameter_parser.cc:146] Parameter 'bos_token' set but unused. W0324 18:14:03.961345 30 parameter_parser.cc:146] Parameter 'capit_logits_tensor_name' set but unused. W0324 18:14:03.961347 30 parameter_parser.cc:146] Parameter 'capitalization_mapping_path' set but unused. W0324 18:14:03.961350 30 parameter_parser.cc:146] Parameter 'delimiter' set but unused. W0324 18:14:03.961354 30 parameter_parser.cc:146] Parameter 'eos_token' set but unused. W0324 18:14:03.961356 30 parameter_parser.cc:146] Parameter 'input_ids_tensor_name' set but unused. W0324 18:14:03.961359 30 parameter_parser.cc:146] Parameter 'language_code' set but unused. W0324 18:14:03.961364 30 parameter_parser.cc:146] Parameter 'model_api' set but unused. W0324 18:14:03.961365 30 parameter_parser.cc:146] Parameter 'model_family' set but unused. W0324 18:14:03.961369 30 parameter_parser.cc:146] Parameter 'model_name' set but unused. W0324 18:14:03.961371 30 parameter_parser.cc:146] Parameter 'pad_chars_with_spaces' set but unused. W0324 18:14:03.961374 30 parameter_parser.cc:146] Parameter 'preserve_accents' set but unused. W0324 18:14:03.961377 30 parameter_parser.cc:146] Parameter 'punct_logits_tensor_name' set but unused. W0324 18:14:03.961380 30 parameter_parser.cc:146] Parameter 'punctuation_mapping_path' set but unused. W0324 18:14:03.961383 30 parameter_parser.cc:146] Parameter 'remove_spaces' set but unused. 
W0324 18:14:03.961386 30 parameter_parser.cc:146] Parameter 'to_lower' set but unused. W0324 18:14:03.961390 30 parameter_parser.cc:146] Parameter 'token_type_tensor_name' set but unused. W0324 18:14:03.961392 30 parameter_parser.cc:146] Parameter 'tokenizer' set but unused. W0324 18:14:03.961395 30 parameter_parser.cc:146] Parameter 'tokenizer_to_lower' set but unused. W0324 18:14:03.961398 30 parameter_parser.cc:146] Parameter 'unicode_normalize' set but unused. W0324 18:14:03.961401 30 parameter_parser.cc:146] Parameter 'unk_token' set but unused. W0324 18:14:03.961405 30 parameter_parser.cc:146] Parameter 'use_int64_nn_inputs' set but unused. W0324 18:14:03.961407 30 parameter_parser.cc:146] Parameter 'vocab' set but unused. W0324 18:14:03.961431 30 parameter_parser.cc:146] Parameter 'attn_mask_tensor_name' set but unused. W0324 18:14:03.961436 30 parameter_parser.cc:146] Parameter 'bos_token' set but unused. W0324 18:14:03.961441 30 parameter_parser.cc:146] Parameter 'capit_logits_tensor_name' set but unused. W0324 18:14:03.961442 30 parameter_parser.cc:146] Parameter 'capitalization_mapping_path' set but unused. W0324 18:14:03.961445 30 parameter_parser.cc:146] Parameter 'delimiter' set but unused. W0324 18:14:03.961448 30 parameter_parser.cc:146] Parameter 'eos_token' set but unused. W0324 18:14:03.961452 30 parameter_parser.cc:146] Parameter 'input_ids_tensor_name' set but unused. W0324 18:14:03.961454 30 parameter_parser.cc:146] Parameter 'language_code' set but unused. W0324 18:14:03.961457 30 parameter_parser.cc:146] Parameter 'model_api' set but unused. W0324 18:14:03.961460 30 parameter_parser.cc:146] Parameter 'model_family' set but unused. W0324 18:14:03.961463 30 parameter_parser.cc:146] Parameter 'model_name' set but unused. W0324 18:14:03.961467 30 parameter_parser.cc:146] Parameter 'pad_chars_with_spaces' set but unused. W0324 18:14:03.961469 30 parameter_parser.cc:146] Parameter 'preserve_accents' set but unused. 
W0324 18:14:03.961473 30 parameter_parser.cc:146] Parameter 'punct_logits_tensor_name' set but unused. W0324 18:14:03.961477 30 parameter_parser.cc:146] Parameter 'punctuation_mapping_path' set but unused. W0324 18:14:03.961479 30 parameter_parser.cc:146] Parameter 'remove_spaces' set but unused. W0324 18:14:03.961483 30 parameter_parser.cc:146] Parameter 'to_lower' set but unused. W0324 18:14:03.961485 30 parameter_parser.cc:146] Parameter 'token_type_tensor_name' set but unused. W0324 18:14:03.961488 30 parameter_parser.cc:146] Parameter 'tokenizer_to_lower' set but unused. W0324 18:14:03.961491 30 parameter_parser.cc:146] Parameter 'unicode_normalize' set but unused. W0324 18:14:03.961494 30 parameter_parser.cc:146] Parameter 'unk_token' set but unused. W0324 18:14:03.961498 30 parameter_parser.cc:146] Parameter 'use_int64_nn_inputs' set but unused. W0324 18:14:03.961500 30 parameter_parser.cc:146] Parameter 'vocab' set but unused. W0324 18:14:03.961565 30 parameter_parser.cc:146] Parameter 'model_api' set but unused. W0324 18:14:03.961572 30 parameter_parser.cc:146] Parameter 'model_family' set but unused. 
I0324 18:14:03.961635 19 backend_model.cc:303] model configuration: { "name": "riva-punctuation-en-US", "platform": "", "backend": "riva_nlp_pipeline", "version_policy": { "latest": { "num_versions": 1 } }, "max_batch_size": 1, "input": [ { "name": "PIPELINE_INPUT", "data_type": "TYPE_STRING", "format": "FORMAT_NONE", "dims": [ 1 ], "is_shape_tensor": false, "allow_ragged_batch": false, "optional": false } ], "output": [ { "name": "PIPELINE_OUTPUT", "data_type": "TYPE_STRING", "dims": [ 1 ], "label_filename": "", "is_shape_tensor": false } ], "batch_input": [], "batch_output": [], "optimization": { "priority": "PRIORITY_DEFAULT", "input_pinned_memory": { "enable": true }, "output_pinned_memory": { "enable": true }, "gather_kernel_buffer_threshold": 0, "eager_batching": false }, "instance_group": [ { "name": "riva-punctuation-en-US_0", "kind": "KIND_CPU", "count": 1, "gpus": [], "secondary_devices": [], "profile": [], "passive": false, "host_policy": "" } ], "default_model_filename": "", "cc_model_filenames": {}, "metric_tags": {}, "parameters": { "token_type_tensor_name": { "string_value": "token_type_ids" }, "to_lower": { "string_value": "true" }, "tokenizer_to_lower": { "string_value": "true" }, "model_name": { "string_value": "riva-trt-riva-punctuation-en-US-nn-bert-base-uncased" }, "attn_mask_tensor_name": { "string_value": "attention_mask" }, "model_api": { "string_value": "/nvidia.riva.nlp.RivaLanguageUnderstanding/PunctuateText" }, "load_model": { "string_value": "false" }, "pipeline_type": { "string_value": "punctuation" }, "model_family": { "string_value": "riva" }, "pad_chars_with_spaces": { "string_value": "False" }, "capit_logits_tensor_name": { "string_value": "capit_logits" }, "vocab": { "string_value": "/data/models/riva-punctuation-en-US/1/f92889b136d2433693cb9127e1aea218_vocab.txt" }, "remove_spaces": { "string_value": "False" }, "language_code": { "string_value": "en-US" }, "eos_token": { "string_value": "[SEP]" }, "punctuation_mapping_path": { 
"string_value": "/data/models/riva-punctuation-en-US/1/bf74918539724a61a0d7703134519ea5_punct_label_ids.csv" }, "unk_token": { "string_value": "[UNK]" }, "delimiter": { "string_value": " " }, "capitalization_mapping_path": { "string_value": "/data/models/riva-punctuation-en-US/1/56633d0a0d8e459b9c8acd572cfa34b8_capit_label_ids.csv" }, "input_ids_tensor_name": { "string_value": "input_ids" }, "preserve_accents": { "string_value": "false" }, "tokenizer": { "string_value": "wordpiece" }, "unicode_normalize": { "string_value": "False" }, "punct_logits_tensor_name": { "string_value": "punct_logits" }, "use_int64_nn_inputs": { "string_value": "False" }, "bos_token": { "string_value": "[CLS]" } }, "model_warmup": [] } E0324 18:14:03.963356 19 model_lifecycle.cc:596] failed to load 'riva-trt-hifigan-English-US' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) E0324 18:14:03.964964 19 model_lifecycle.cc:596] failed to load 'riva-trt-riva-punctuation-en-US-nn-bert-base-uncased' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) E0324 18:14:03.966507 19 model_lifecycle.cc:596] failed to load 'riva-trt-riva_intent_misty-nn-bert-base-uncased' version 1: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) E0324 18:14:03.967139 19 model_lifecycle.cc:596] failed to load 'spectrogram_chunker-English-US' version 1: Invalid argument: instance group spectrogram_chunker-English-US_0 of model spectrogram_chunker-English-US has kind KIND_GPU but no GPUs are available E0324 18:14:03.969340 19 model_lifecycle.cc:596] failed to load 'tts_postprocessor-English-US' version 1: 
Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so)
E0324 18:14:03.975181 19 model_lifecycle.cc:596] failed to load 'tts_preprocessor-English-US' version 1: Invalid argument: instance group tts_preprocessor-English-US_0 of model tts_preprocessor-English-US has kind KIND_GPU but no GPUs are available
I0324 18:14:03.975322 19 pipeline_library.cc:28] TRITONBACKEND_ModelInstanceInitialize: riva-punctuation-en-US_0 (device 0)
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaHostRegister( pinned_host_punctbuffer.data(), pinned_host_punctbuffer.size() * sizeof(float), 0)' in file riva/nlp/pipeline/punctuator/punctuator.cc line 158'
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaHostRegister( pinned_host_capitbuffer.data(), pinned_host_capitbuffer.size() * sizeof(float), 0)' in file riva/nlp/pipeline/punctuator/punctuator.cc line 160'
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaSetDevice(device_id)' in file ./riva/pipeline/pipeline.h line 52'
I0324 18:14:04.004839 19 model_lifecycle.cc:693] successfully loaded 'riva-punctuation-en-US' version 1
E0324 18:14:04.005023 19 model_repository_manager.cc:481] Invalid argument: ensemble 'conformer-en-US-asr-streaming' depends on 'riva-trt-conformer-en-US-asr-streaming-am-streaming' which has no loaded version
E0324 18:14:04.005040 19 model_repository_manager.cc:481] Invalid argument: ensemble 'fastpitch_hifigan_ensemble-English-US' depends on 'tts_postprocessor-English-US' which has no loaded version
E0324 18:14:04.005047 19 model_repository_manager.cc:481] Invalid argument: ensemble 'riva_intent_misty' depends on 'riva-trt-riva_intent_misty-nn-bert-base-uncased' which has no loaded version
I0324 18:14:04.005108 19 server.cc:563]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0324 18:14:04.005307 19 server.cc:590]
+---------+------+--------+
| Backend | Path | Config |
+---------+------+--------+
| riva_asr_decoder | /opt/tritonserver/backends/riva_asr_decoder/libtriton_riva_asr_decoder.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_nlp_detokenizer | /opt/tritonserver/backends/riva_nlp_detokenizer/libtriton_riva_nlp_detokenizer.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_asr_endpointing | /opt/tritonserver/backends/riva_asr_endpointing/libtriton_riva_asr_endpointing.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_nlp_tokenizer | /opt/tritonserver/backends/riva_nlp_tokenizer/libtriton_riva_nlp_tokenizer.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_asr_features | /opt/tritonserver/backends/riva_asr_features/libtriton_riva_asr_features.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_tts_preprocessor | /opt/tritonserver/backends/riva_tts_preprocessor/libtriton_riva_tts_preprocessor.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_nlp_pipeline | /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_nlp_seqlabel | /opt/tritonserver/backends/riva_nlp_seqlabel/libtriton_riva_nlp_seqlabel.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_tts_chunker | /opt/tritonserver/backends/riva_tts_chunker/libtriton_riva_tts_chunker.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"5.300000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+------+--------+

I0324 18:14:04.005534 19 server.cc:633]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
| conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming | 1 | READY |
| conformer-en-US-asr-streaming-endpointing-streaming | 1 | READY |
| conformer-en-US-asr-streaming-feature-extractor-streaming | 1 | UNAVAILABLE: Invalid argument: instance group conformer-en-US-asr-streaming-feature-extractor-streaming_0 of model conformer-en-US-asr-streaming-feature-extractor-streaming has kind KIND_GPU but no GPUs are available |
| intent_slot_detokenizer | 1 | READY |
| intent_slot_label_tokens_misty | 1 | READY |
| intent_slot_tokenizer-en-US-misty | 1 | READY |
| riva-onnx-fastpitch_encoder-English-US | 1 | UNAVAILABLE: Invalid argument: instance group riva-onnx-fastpitch_encoder-English-US_0 of model riva-onnx-fastpitch_encoder-English-US has kind KIND_GPU but no GPUs are available |
| riva-punctuation-en-US | 1 | READY |
| riva-trt-conformer-en-US-asr-streaming-am-streaming | 1 | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) |
| riva-trt-hifigan-English-US | 1 | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) |
| riva-trt-riva-punctuation-en-US-nn-bert-base-uncased | 1 | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) |
| riva-trt-riva_intent_misty-nn-bert-base-uncased | 1 | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) |
| spectrogram_chunker-English-US | 1 | UNAVAILABLE: Invalid argument: instance group spectrogram_chunker-English-US_0 of model spectrogram_chunker-English-US has kind KIND_GPU but no GPUs are available |
| tts_postprocessor-English-US | 1 | UNAVAILABLE: Not found: unable to load shared library: /lib/aarch64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvdla_compiler.so) |
| tts_preprocessor-English-US | 1 | UNAVAILABLE: Invalid argument: instance group tts_preprocessor-English-US_0 of model tts_preprocessor-English-US has kind KIND_GPU but no GPUs are available |
+-------+---------+--------+

W0324 18:14:04.005631 19 metrics.cc:354] No polling metrics (CPU, GPU, Cache) are enabled. Will not poll for them.
I0324 18:14:04.005809 19 tritonserver.cc:2264]
+----------------------------------+-------+
| Option                           | Value |
+----------------------------------+-------+
| server_id                        | triton |
| server_version                   | 2.27.0 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |
| model_repository_path[0]         | /data/models |
| model_control_mode               | MODE_NONE |
| strict_model_config              | 1 |
| rate_limit                       | OFF |
| pinned_memory_pool_byte_size     | 268435456 |
| cuda_memory_pool_byte_size{0}    | 1000000000 |
| response_cache_byte_size         | 0 |
| min_supported_compute_capability | 5.3 |
| strict_readiness                 | 1 |
| exit_timeout                     | 30 |
+----------------------------------+-------+

I0324 18:14:04.005879 19 server.cc:264] Waiting for in-flight requests to complete.
I0324 18:14:04.005953 19 server.cc:280] Timeout 30: Found 0 model versions that have in-flight inferences
I0324 18:14:04.006321 19 server.cc:295] All models are stopped, unloading models
I0324 18:14:04.006402 19 server.cc:302] Timeout 30: Found 6 live models and 0 in-flight non-inference requests
I0324 18:14:04.008310 19 pipeline_library.cc:31] TRITONBACKEND_ModelInstanceFinalize: delete instance state
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaHostUnregister(pinned_host_punctbuffer.data())' in file riva/nlp/pipeline/punctuator/punctuator.cc line 167'
cudaError_t 35 : "CUDA driver version is insufficient for CUDA runtime version" returned from 'cudaHostUnregister(pinned_host_capitbuffer.data())' in file riva/nlp/pipeline/punctuator/punctuator.cc line 168'
I0324 18:14:04.008627 19 sequence_label_cbe.cc:142] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.008697 19 sequence_label_cbe.cc:138] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.008732 19 ctc-decoder-library.cc:27] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.008832 19 detokenizer_cbe.cc:150] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.008845 19 tokenizer_library.cc:27] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.008906 19 detokenizer_cbe.cc:146] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.008666 19 endpointing_library.cc:28] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0324 18:14:04.009107 19 model_lifecycle.cc:578] successfully unloaded 'intent_slot_detokenizer' version 1
I0324 18:14:04.013533 19 pipeline_library.cc:27] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.014170 19 endpointing_library.cc:23] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.014555 19 tokenizer_library.cc:22] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.008742 19 model_lifecycle.cc:578] successfully unloaded 'intent_slot_label_tokens_misty' version 1
I0324 18:14:04.014816 19 model_lifecycle.cc:578] successfully unloaded 'conformer-en-US-asr-streaming-endpointing-streaming' version 1
I0324 18:14:04.015449 19 model_lifecycle.cc:578] successfully unloaded 'riva-punctuation-en-US' version 1
I0324 18:14:04.022604 19 model_lifecycle.cc:578] successfully unloaded 'intent_slot_tokenizer-en-US-misty' version 1
I0324 18:14:04.945356 19 ctc-decoder-library.cc:24] TRITONBACKEND_ModelFinalize: delete model state
I0324 18:14:04.946123 19 model_lifecycle.cc:578] successfully unloaded 'conformer-en-US-asr-streaming-ctc-decoder-cpu-streaming' version 1
I0324 18:14:05.006539 19 server.cc:302] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
E0324 18:14:09.729276 21 model_registry.cc:286] error: unable to get server status: failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8001: Failed to connect to remote host: Connection refused
One of the processes has exited unexpectedly. Stopping container.
W0324 18:14:09.747316 21 riva_server.cc:196] Signal: 15

jasonthenderson commented 3 months ago

Thank you for confirming, @dusty-nv!

ai-and-i commented 3 months ago

I got Riva running on JP6 by upgrading Ubuntu inside the riva-speech image. Something like this:

docker run --name riva_jp6 -ti nvcr.io/nvidia/riva/riva-speech:2.14.0-l4t-aarch64 bash

# inside the docker container
apt-get update && apt-get upgrade -y
apt-get install -y ubuntu-release-upgrader-core
do-release-upgrade  # When prompted, choose to NOT clean obsolete packages
apt-get install -y curl
exit

# outside of the container
docker container commit riva_jp6 nvcr.io/nvidia/riva/riva-speech:2.14.0-l4t-aarch64
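
A quick way to confirm that the committed image actually picked up the changes is to print its Ubuntu release and check that curl is present. This is only a sketch: the helper names `probe_image` and `summarize_image` are made up here, and it assumes docker is available and that the image lets you override its entrypoint with bash.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of Riva or jetson-containers.

IMAGE="nvcr.io/nvidia/riva/riva-speech:2.14.0-l4t-aarch64"

# Print the distro version and the curl location from inside the image.
# Assumes the entrypoint can be overridden with bash.
probe_image() {
  docker run --rm --entrypoint bash "$1" -c \
    '. /etc/os-release; echo "version=$VERSION_ID"; echo "curl=$(command -v curl)"'
}

# Turn the probe output (read from stdin) into a one-line verdict.
summarize_image() {
  version=""
  curl_path=""
  while IFS= read -r line; do
    case "$line" in
      version=*) version="${line#version=}" ;;
      curl=*)    curl_path="${line#curl=}" ;;
    esac
  done
  if [ -z "$curl_path" ]; then
    echo "Ubuntu ${version:-unknown}: curl missing (start-riva calls it)"
  else
    echo "Ubuntu ${version:-unknown}: curl found at $curl_path"
  fi
}

# Usage (not run automatically):
#   probe_image "$IMAGE" | summarize_image
```

Splitting the docker call from the parsing keeps the verdict logic checkable without a Jetson at hand.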

UserName-wang commented 3 months ago

@ai-and-i, I tried it and it's working now! Thank you!

dusty-nv commented 3 months ago

Hi guys, Riva release 2.15 is out now on NGC for JP6; however, there is a known issue with the streaming TTS. Interesting that you got 2.14 to work!


johnnynunez commented 3 months ago

@dusty-nv it is now not possible to reflash the AGX with JetPack 6 using the latest version of JP6 and SDK Manager.

jasonthenderson commented 3 months ago

Thanks @dusty-nv and @ai-and-i - I tried upgrading my own container per @ai-and-i's instructions; however, it was a 2.12 container. It still prints 'retrying in 10 seconds...', but there are no error logs now when I use the docker command the script provides at the end.

So then I downloaded 2.15 and ran into the same behavior: it displays 'retrying in 10 seconds...', and when it finally errors out, the docker logs riva-speech command shows no error logs.

Any ideas on where to look for what is going wrong? It's very odd to me that both give the same behavior now.

jasonthenderson commented 3 months ago

OK, a reboot of the Orin AGX fixed it, and the Riva container is now working. It seems to work with both 2.12 and 2.15.

@dusty-nv - I saw you post that you were setting up xtts - do you have any instructions posted on how to use that for streaming out to a speaker?