Open LL-AI-dev opened 8 months ago
After some fiddling, I was able to get the nemo2riva part of the deployment working using a Riva Docker image. The image provides an environment that can convert .nemo files to .riva files for both pretrained and finetuned (regular and adapter variants) FastPitch and HiFi-GAN models.
The step that creates the .rmir file also succeeds, in that the riva-build command completes without error. However, when the model is deployed with bash riva_start.sh, we get a new error:
failed to load 'riva-onnx-fastpitch_encoder-Jaz_v1' version 1: Internal: onnx runtime error 1: Load model from /data/models/riva-onnx-fastpitch_encoder-Jaz_v1/1/model.onnx failed:/workspace/onnxruntime/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8
(The full Docker log is listed at the bottom.)
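According to the message, the exported model.onnx is stamped with IR version 9 while the ONNX Runtime inside the Riva server accepts at most 8. As a minimal sketch for confirming which IR version a given file carries without installing extra packages, the snippet below walks the top-level protobuf fields of the file and returns field 1 of the ONNX `ModelProto`, which is `ir_version`. The function names are hypothetical helpers written for this illustration, not part of any NVIDIA or ONNX tool.

```python
def read_varint(buf, pos):
    """Decode one protobuf varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def onnx_ir_version(data):
    """Return ir_version (ModelProto field 1) from serialized ONNX bytes, or None."""
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field, wire = tag >> 3, tag & 0x7
        if wire == 0:  # varint payload
            value, pos = read_varint(data, pos)
            if field == 1:
                return value
        elif wire == 1:  # fixed 64-bit payload
            pos += 8
        elif wire == 2:  # length-delimited payload (strings, sub-messages)
            length, pos = read_varint(data, pos)
            pos += length
        elif wire == 5:  # fixed 32-bit payload
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire}")
    return None
```

It could be called as `onnx_ir_version(open("model.onnx", "rb").read())` on the file named in the error. If the `onnx` package is installed, `onnx.load(path).ir_version` reports the same value.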
DOCKERFILE
FROM nvcr.io/nvidia/riva/riva-speech:2.14.0-servicemaker
#make sure pip is installed
RUN apt-get update && apt-get install -y python3-pip
#get a simple fastpitch model to convert
RUN wget --content-disposition \
'https://api.ngc.nvidia.com/v2/models/org/nvidia/team/nemo/tts_en_fastpitch/IPA_1.13.0/files?redirect=true&path=tts_en_fastpitch_align_ipa.nemo' \
-O tts_en_fastpitch_align_ipa.nemo
#install NeMo dependencies
RUN apt-get update && apt-get install -y libsndfile1 ffmpeg
RUN pip install Cython
#install nemo2riva dependencies
RUN pip install nvidia-pyindex
#install NeMo
RUN git clone https://github.com/NVIDIA/NeMo
WORKDIR NeMo
RUN git switch 'r1.23.0'
RUN pip install -e .
WORKDIR ..
#install some other required packages
RUN pip install matplotlib
RUN pip install einops
RUN pip install transformers
RUN pip install pandas
RUN pip install inflect
RUN pip install typing_extensions==4.7.1
RUN pip install wandb
RUN pip install youtokentome
RUN pip install editdistance
RUN pip install nemo_text_processing
RUN pip install lhotse
RUN pip install pyannote.audio
RUN pip install webdataset
RUN pip install datasets
RUN pip install jiwer
#install nemo2riva
RUN pip install nemo2riva==2.14.0
#fix the errors arising due to nemo requiring python 3.10
RUN sed -i '1i from __future__ import annotations' /usr/local/lib/python3.8/dist-packages/nemo/collections/common/tokenizers/canary_tokenizer.py
RUN sed -i '1i from __future__ import annotations' /usr/local/lib/python3.8/dist-packages/nemo/collections/asr/data/audio_to_text_lhotse.py
RUN sed -i '1i from __future__ import annotations' /usr/local/lib/python3.8/dist-packages/nemo/collections/asr/data/audio_to_text_lhotse_prompted.py
RUN sed -i '1i from __future__ import annotations' /usr/local/lib/python3.8/dist-packages/nemo/collections/common/data/lhotse/nemo_adapters.py
RUN sed -i '1i from __future__ import annotations' /usr/local/lib/python3.8/dist-packages/nemo/collections/common/data/lhotse/dataloader.py
RUN sed -i '1i from __future__ import annotations' /usr/local/lib/python3.8/dist-packages/hydra/_internal/utils.py
RUN sed -i '1i from __future__ import annotations' /usr/local/lib/python3.8/dist-packages/nemo/collections/asr/models/aed_multitask_models.py
RUN sed -i '1i from __future__ import annotations' /usr/local/lib/python3.8/dist-packages/nemo/collections/asr/data/huggingface/hf_audio_to_text.py
RUN sed -i '1i from __future__ import annotations' /usr/local/lib/python3.8/dist-packages/nemo/collections/asr/parts/submodules/rnnt_greedy_decoding.py
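The sed lines work around Python 3.8 by postponing annotation evaluation (`from __future__ import annotations`, available since Python 3.7), so newer type-hint syntax in annotations is not evaluated when the module is imported. A rough Python equivalent, offered only as a sketch, is shown here; unlike `sed -i '1i ...'` it skips files that already start with the import, so listing a file twice does not insert the line twice. `prepend_future_import` and `patch_files` are hypothetical names for this illustration.

```python
from pathlib import Path

FUTURE_IMPORT = "from __future__ import annotations\n"


def prepend_future_import(path):
    """Insert the future import as the first line unless it is already there.

    Returns True if the file was modified, False otherwise.
    """
    target = Path(path)
    text = target.read_text(encoding="utf-8")
    if text.startswith(FUTURE_IMPORT):
        return False
    target.write_text(FUTURE_IMPORT + text, encoding="utf-8")
    return True


def patch_files(paths):
    """Patch every path once and return the list of files that were changed."""
    return [str(p) for p in paths if prepend_future_import(p)]
```

In a Dockerfile this could replace the individual sed commands with a single `RUN python3 patch.py`, assuming the script is copied into the image and given the same file list.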
Below is the full Docker log:
==========================
=== Riva Speech Skills ===
==========================
NVIDIA Release 23.12 (build 77214108)
Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
> Riva waiting for Triton server to load all models...retrying in 1 second
I0304 06:27:43.963750 102 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f0fc4000000' with size 268435456
I0304 06:27:43.966148 102 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 1000000000
I0304 06:27:43.971749 102 model_lifecycle.cc:459] loading: riva-onnx-fastpitch_encoder-Jaz_v1:1
I0304 06:27:43.971796 102 model_lifecycle.cc:459] loading: riva-trt-hifigan-Jaz_v1:1
I0304 06:27:43.971839 102 model_lifecycle.cc:459] loading: spectrogram_chunker-Jaz_v1:1
I0304 06:27:43.971880 102 model_lifecycle.cc:459] loading: tts_postprocessor-Jaz_v1:1
I0304 06:27:43.971934 102 model_lifecycle.cc:459] loading: tts_preprocessor-Jaz_v1:1
I0304 06:27:43.973206 102 onnxruntime.cc:2459] TRITONBACKEND_Initialize: onnxruntime
I0304 06:27:43.973231 102 onnxruntime.cc:2469] Triton TRITONBACKEND API version: 1.10
I0304 06:27:43.973236 102 onnxruntime.cc:2475] 'onnxruntime' TRITONBACKEND API version: 1.10
I0304 06:27:43.973241 102 onnxruntime.cc:2505] backend configuration:
{"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}}
I0304 06:27:44.037319 102 tensorrt.cc:5444] TRITONBACKEND_Initialize: tensorrt
I0304 06:27:44.037343 102 tensorrt.cc:5454] Triton TRITONBACKEND API version: 1.10
I0304 06:27:44.037351 102 tensorrt.cc:5460] 'tensorrt' TRITONBACKEND API version: 1.10
I0304 06:27:44.037356 102 tensorrt.cc:5488] backend configuration:
{"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}}
I0304 06:27:44.037540 102 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: riva-onnx-fastpitch_encoder-Jaz_v1 (version 1)
I0304 06:27:44.466490 102 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: riva-onnx-fastpitch_encoder-Jaz_v1_0 (GPU device 0)
I0304 06:27:44.583420 102 onnxruntime.cc:2640] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0304 06:27:44.583435 102 tensorrt.cc:5578] TRITONBACKEND_ModelInitialize: riva-trt-hifigan-Jaz_v1 (version 1)
I0304 06:27:44.583459 102 onnxruntime.cc:2586] TRITONBACKEND_ModelFinalize: delete model state
E0304 06:27:44.583484 102 model_lifecycle.cc:596] failed to load 'riva-onnx-fastpitch_encoder-Jaz_v1' version 1: Internal: onnx runtime error 1: Load model from /data/models/riva-onnx-fastpitch_encoder-Jaz_v1/1/model.onnx failed:/workspace/onnxruntime/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8
I0304 06:27:44.583984 102 backend_model.cc:188] Overriding execution policy to "TRITONBACKEND_EXECUTION_BLOCKING" for sequence model "riva-trt-hifigan-Jaz_v1"
I0304 06:27:44.584806 102 spectrogram-chunker.cc:270] TRITONBACKEND_ModelInitialize: spectrogram_chunker-Jaz_v1 (version 1)
I0304 06:27:44.585550 102 backend_model.cc:303] model configuration:
{
"name": "spectrogram_chunker-Jaz_v1",
"platform": "",
"backend": "riva_tts_chunker",
"version_policy": {
"latest": {
"num_versions": 1
}
},
"max_batch_size": 8,
"input": [
{
"name": "SPECTROGRAM",
"data_type": "TYPE_FP32",
"format": "FORMAT_NONE",
"dims": [
80,
-1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
},
{
"name": "IS_LAST_SENTENCE",
"data_type": "TYPE_INT32",
"format": "FORMAT_NONE",
"dims": [
1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
},
{
"name": "NUM_VALID_FRAMES_IN",
"data_type": "TYPE_INT64",
"format": "FORMAT_NONE",
"dims": [
1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
},
{
"name": "SENTENCE_NUM",
"data_type": "TYPE_INT32",
"format": "FORMAT_NONE",
"dims": [
1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
},
{
"name": "DURATIONS",
"data_type": "TYPE_FP32",
"format": "FORMAT_NONE",
"dims": [
-1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
},
{
"name": "PROCESSED_TEXT",
"data_type": "TYPE_STRING",
"format": "FORMAT_NONE",
"dims": [
1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
},
{
"name": "VOLUME",
"data_type": "TYPE_FP32",
"format": "FORMAT_NONE",
"dims": [
-1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
}
],
"output": [
{
"name": "SPECTROGRAM_CHUNK",
"data_type": "TYPE_FP32",
"dims": [
80,
-1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "END_FLAG",
"data_type": "TYPE_INT32",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "NUM_VALID_SAMPLES_OUT",
"data_type": "TYPE_INT32",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "SENTENCE_NUM",
"data_type": "TYPE_INT32",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "DURATIONS",
"data_type": "TYPE_FP32",
"dims": [
-1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "PROCESSED_TEXT",
"data_type": "TYPE_STRING",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "VOLUME",
"data_type": "TYPE_FP32",
"dims": [
-1
],
"label_filename": "",
"is_shape_tensor": false
}
],
"batch_input": [],
"batch_output": [],
"optimization": {
"priority": "PRIORITY_DEFAULT",
"input_pinned_memory": {
"enable": true
},
"output_pinned_memory": {
"enable": true
},
"gather_kernel_buffer_threshold": 0,
"eager_batching": false
},
"sequence_batching": {
"oldest": {
"max_candidate_sequences": 8,
"preferred_batch_size": [
8
],
"max_queue_delay_microseconds": 1000
},
"max_sequence_idle_microseconds": 60000000,
"control_input": [
{
"name": "START",
"control": [
{
"kind": "CONTROL_SEQUENCE_START",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "READY",
"control": [
{
"kind": "CONTROL_SEQUENCE_READY",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "END",
"control": [
{
"kind": "CONTROL_SEQUENCE_END",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "CORRID",
"control": [
{
"kind": "CONTROL_SEQUENCE_CORRID",
"int32_false_true": [],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_UINT64"
}
]
}
],
"state": []
},
"instance_group": [
{
"name": "spectrogram_chunker-Jaz_v1_0",
"kind": "KIND_GPU",
"count": 1,
"gpus": [
0
],
"secondary_devices": [],
"profile": [],
"passive": false,
"host_policy": ""
}
],
"default_model_filename": "",
"cc_model_filenames": {},
"metric_tags": {},
"parameters": {
"num_mels": {
"string_value": "80"
},
"num_samples_per_frame": {
"string_value": "512"
},
"supports_volume": {
"string_value": "True"
},
"chunk_length": {
"string_value": "80"
},
"max_execution_batch_size": {
"string_value": "8"
}
},
"model_warmup": [],
"model_transaction_policy": {
"decoupled": true
}
}
I0304 06:27:44.585599 102 tensorrt.cc:5627] TRITONBACKEND_ModelInstanceInitialize: riva-trt-hifigan-Jaz_v1_0 (GPU device 0)
I0304 06:27:44.640208 102 logging.cc:49] Loaded engine size: 28 MiB
> Riva waiting for Triton server to load all models...retrying in 1 second
I0304 06:27:44.787852 102 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +31, now: CPU 0, GPU 31 (MiB)
I0304 06:27:44.796150 102 logging.cc:49] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +186, now: CPU 0, GPU 217 (MiB)
I0304 06:27:44.796536 102 tensorrt.cc:1547] Created instance riva-trt-hifigan-Jaz_v1_0 on GPU 0 with stream priority 0 and optimization profile default[0];
I0304 06:27:44.796915 102 model_lifecycle.cc:693] successfully loaded 'riva-trt-hifigan-Jaz_v1' version 1
I0304 06:27:44.802776 102 spectrogram-chunker.cc:272] TRITONBACKEND_ModelInstanceInitialize: spectrogram_chunker-Jaz_v1_0 (device 0)
I0304 06:27:44.802834 102 tts-postprocessor.cc:305] TRITONBACKEND_ModelInitialize: tts_postprocessor-Jaz_v1 (version 1)
I0304 06:27:44.803194 102 model_lifecycle.cc:693] successfully loaded 'spectrogram_chunker-Jaz_v1' version 1
I0304 06:27:44.803487 102 backend_model.cc:303] model configuration:
{
"name": "tts_postprocessor-Jaz_v1",
"platform": "",
"backend": "riva_tts_postprocessor",
"version_policy": {
"latest": {
"num_versions": 1
}
},
"max_batch_size": 8,
"input": [
{
"name": "INPUT",
"data_type": "TYPE_FP32",
"format": "FORMAT_NONE",
"dims": [
1,
-1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
},
{
"name": "NUM_VALID_SAMPLES",
"data_type": "TYPE_INT32",
"format": "FORMAT_NONE",
"dims": [
1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
},
{
"name": "Prosody_volume",
"data_type": "TYPE_FP32",
"format": "FORMAT_NONE",
"dims": [
-1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
}
],
"output": [
{
"name": "OUTPUT",
"data_type": "TYPE_FP32",
"dims": [
-1
],
"label_filename": "",
"is_shape_tensor": false
}
],
"batch_input": [],
"batch_output": [],
"optimization": {
"priority": "PRIORITY_DEFAULT",
"input_pinned_memory": {
"enable": true
},
"output_pinned_memory": {
"enable": true
},
"gather_kernel_buffer_threshold": 0,
"eager_batching": false
},
"sequence_batching": {
"oldest": {
"max_candidate_sequences": 8,
"preferred_batch_size": [
8
],
"max_queue_delay_microseconds": 100
},
"max_sequence_idle_microseconds": 60000000,
"control_input": [
{
"name": "START",
"control": [
{
"kind": "CONTROL_SEQUENCE_START",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "READY",
"control": [
{
"kind": "CONTROL_SEQUENCE_READY",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "END",
"control": [
{
"kind": "CONTROL_SEQUENCE_END",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "CORRID",
"control": [
{
"kind": "CONTROL_SEQUENCE_CORRID",
"int32_false_true": [],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_UINT64"
}
]
}
],
"state": []
},
"instance_group": [
{
"name": "tts_postprocessor-Jaz_v1_0",
"kind": "KIND_GPU",
"count": 1,
"gpus": [
0
],
"secondary_devices": [],
"profile": [],
"passive": false,
"host_policy": ""
}
],
"default_model_filename": "",
"cc_model_filenames": {},
"metric_tags": {},
"parameters": {
"use_denoiser": {
"string_value": "False"
},
"supports_volume": {
"string_value": "True"
},
"filter_length": {
"string_value": "1024"
},
"fade_length": {
"string_value": "256"
},
"num_samples_per_frame": {
"string_value": "512"
},
"chunk_num_samples": {
"string_value": "40960"
},
"max_execution_batch_size": {
"string_value": "8"
},
"max_chunk_size": {
"string_value": "131072"
},
"hop_length": {
"string_value": "256"
}
},
"model_warmup": [],
"model_transaction_policy": {
"decoupled": false
}
}
I0304 06:27:44.803568 102 tts-postprocessor.cc:307] TRITONBACKEND_ModelInstanceInitialize: tts_postprocessor-Jaz_v1_0 (device 0)
I0304 06:27:44.824235 102 tts-preprocessor.cc:337] TRITONBACKEND_ModelInitialize: tts_preprocessor-Jaz_v1 (version 1)
I0304 06:27:44.824489 102 model_lifecycle.cc:693] successfully loaded 'tts_postprocessor-Jaz_v1' version 1
W0304 06:27:44.824928 102 tts-preprocessor.cc:284] Parameter abbreviation_path is deprecated
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0304 06:27:44.824993 112 preprocessor.cc:231] TTS character mapping loaded from /data/models/tts_preprocessor-Jaz_v1/1/mapping.txt
I0304 06:27:44.921231 112 preprocessor.cc:269] TTS phonetic mapping loaded from /data/models/tts_preprocessor-Jaz_v1/1/ipa_cmudict-0.7b_nv22.08.txt
I0304 06:27:44.921326 112 preprocessor.cc:282] Abbreviation mapping loaded from /data/models/tts_preprocessor-Jaz_v1/1/abbr.txt
I0304 06:27:44.921344 112 normalize.cc:52] Speech Class far file missing:/data/models/tts_preprocessor-Jaz_v1/1/speech_class.far
I0304 06:27:45.010165 112 preprocessor.cc:292] TTS normalizer loaded from /data/models/tts_preprocessor-Jaz_v1/1/
I0304 06:27:45.010266 102 backend_model.cc:303] model configuration:
{
"name": "tts_preprocessor-Jaz_v1",
"platform": "",
"backend": "riva_tts_preprocessor",
"version_policy": {
"latest": {
"num_versions": 1
}
},
"max_batch_size": 8,
"input": [
{
"name": "input_string",
"data_type": "TYPE_STRING",
"format": "FORMAT_NONE",
"dims": [
1
],
"is_shape_tensor": false,
"allow_ragged_batch": false,
"optional": false
}
],
"output": [
{
"name": "output",
"data_type": "TYPE_INT64",
"dims": [
-1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "output_mask",
"data_type": "TYPE_FP32",
"dims": [
1,
400,
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "output_length",
"data_type": "TYPE_INT32",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "is_last_sentence",
"data_type": "TYPE_INT32",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "output_string",
"data_type": "TYPE_STRING",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "sentence_num",
"data_type": "TYPE_INT32",
"dims": [
1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "pitch",
"data_type": "TYPE_FP32",
"dims": [
-1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "duration",
"data_type": "TYPE_FP32",
"dims": [
-1
],
"label_filename": "",
"is_shape_tensor": false
},
{
"name": "volume",
"data_type": "TYPE_FP32",
"dims": [
-1
],
"label_filename": "",
"is_shape_tensor": false
}
],
"batch_input": [],
"batch_output": [],
"optimization": {
"graph": {
"level": 0
},
"priority": "PRIORITY_DEFAULT",
"cuda": {
"graphs": false,
"busy_wait_events": false,
"graph_spec": [],
"output_copy_stream": true
},
"input_pinned_memory": {
"enable": true
},
"output_pinned_memory": {
"enable": true
},
"gather_kernel_buffer_threshold": 0,
"eager_batching": false
},
"sequence_batching": {
"oldest": {
"max_candidate_sequences": 8,
"preferred_batch_size": [
8
],
"max_queue_delay_microseconds": 100
},
"max_sequence_idle_microseconds": 60000000,
"control_input": [
{
"name": "START",
"control": [
{
"kind": "CONTROL_SEQUENCE_START",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "READY",
"control": [
{
"kind": "CONTROL_SEQUENCE_READY",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "END",
"control": [
{
"kind": "CONTROL_SEQUENCE_END",
"int32_false_true": [
0,
1
],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_INVALID"
}
]
},
{
"name": "CORRID",
"control": [
{
"kind": "CONTROL_SEQUENCE_CORRID",
"int32_false_true": [],
"fp32_false_true": [],
"bool_false_true": [],
"data_type": "TYPE_UINT64"
}
]
}
],
"state": []
},
"instance_group": [
{
"name": "tts_preprocessor-Jaz_v1_0",
"kind": "KIND_GPU",
"count": 1,
"gpus": [
0
],
"secondary_devices": [],
"profile": [],
"passive": false,
"host_policy": ""
}
],
"default_model_filename": "",
"cc_model_filenames": {},
"metric_tags": {},
"parameters": {
"max_sequence_length": {
"string_value": "400"
},
"supports_speaker_mixing": {
"string_value": "False"
},
"upper_case_chars": {
"string_value": "True"
},
"g2p_ignore_ambiguous": {
"string_value": "False"
},
"phone_set": {
"string_value": "ipa"
},
"dictionary_path": {
"string_value": "/data/models/tts_preprocessor-Jaz_v1/1/ipa_cmudict-0.7b_nv22.08.txt"
},
"abbreviations_path": {
"string_value": "/data/models/tts_preprocessor-Jaz_v1/1/abbr.txt"
},
"supports_ragged_batches": {
"string_value": "True"
},
"norm_proto_path": {
"string_value": "/data/models/tts_preprocessor-Jaz_v1/1/"
},
"mapping_path": {
"string_value": "/data/models/tts_preprocessor-Jaz_v1/1/mapping.txt"
},
"normalize_pitch": {
"string_value": "True"
},
"upper_case_g2p": {
"string_value": "True"
},
"pitch_std": {
"string_value": "50.46181106567383"
},
"max_input_length": {
"string_value": "2000"
},
"language": {
"string_value": "en-US"
},
"pad_with_space": {
"string_value": "True"
},
"subvoices": {
"string_value": "0:0"
}
},
"model_warmup": [],
"model_transaction_policy": {
"decoupled": true
}
}
I0304 06:27:45.010360 102 tts-preprocessor.cc:339] TRITONBACKEND_ModelInstanceInitialize: tts_preprocessor-Jaz_v1_0 (device 0)
I0304 06:27:45.010680 102 model_lifecycle.cc:693] successfully loaded 'tts_preprocessor-Jaz_v1' version 1
E0304 06:27:45.010726 102 model_repository_manager.cc:481] Invalid argument: ensemble 'fastpitch_hifigan_ensemble-Jaz_v1' depends on 'riva-onnx-fastpitch_encoder-Jaz_v1' which has no loaded version
I0304 06:27:45.010785 102 server.cc:563]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0304 06:27:45.010857 102 server.cc:590]
+------------------------+---------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+------------------------+---------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_tts_preprocessor | /opt/tritonserver/backends/riva_tts_preprocessor/libtriton_riva_tts_preprocessor.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| tensorrt | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_tts_chunker | /opt/tritonserver/backends/riva_tts_chunker/libtriton_riva_tts_chunker.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| riva_tts_postprocessor | /opt/tritonserver/backends/riva_tts_postprocessor/libtriton_riva_tts_postprocessor.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+------------------------+---------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0304 06:27:45.010956 102 server.cc:633]
+------------------------------------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+------------------------------------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| riva-onnx-fastpitch_encoder-Jaz_v1 | 1 | UNAVAILABLE: Internal: onnx runtime error 1: Load model from /data/models/riva-onnx-fastpitch_encoder-Jaz_v1/1/model.onnx failed:/workspace/onnxruntime/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8 |
| riva-trt-hifigan-Jaz_v1 | 1 | READY |
| spectrogram_chunker-Jaz_v1 | 1 | READY |
| tts_postprocessor-Jaz_v1 | 1 | READY |
| tts_preprocessor-Jaz_v1 | 1 | READY |
+------------------------------------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0304 06:27:45.084929 102 metrics.cc:864] Collecting metrics for GPU 0: Tesla T4
I0304 06:27:45.085172 102 metrics.cc:757] Collecting CPU metrics
I0304 06:27:45.085354 102 tritonserver.cc:2264]
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.27.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |
| model_repository_path[0] | /data/models |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 1000000000 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0304 06:27:45.085368 102 server.cc:264] Waiting for in-flight requests to complete.
I0304 06:27:45.085383 102 server.cc:280] Timeout 30: Found 0 model versions that have in-flight inferences
I0304 06:27:45.085634 102 server.cc:295] All models are stopped, unloading models
I0304 06:27:45.085638 102 tts-postprocessor.cc:310] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0304 06:27:45.085652 102 server.cc:302] Timeout 30: Found 4 live models and 0 in-flight non-inference requests
I0304 06:27:45.085697 102 tensorrt.cc:5665] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0304 06:27:45.085746 102 spectrogram-chunker.cc:275] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0304 06:27:45.085788 102 tts-preprocessor.cc:342] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0304 06:27:45.085799 102 spectrogram-chunker.cc:271] TRITONBACKEND_ModelFinalize: delete model state
I0304 06:27:45.085826 102 tts-preprocessor.cc:338] TRITONBACKEND_ModelFinalize: delete model state
I0304 06:27:45.085946 102 model_lifecycle.cc:578] successfully unloaded 'spectrogram_chunker-Jaz_v1' version 1
I0304 06:27:45.092669 102 tts-postprocessor.cc:306] TRITONBACKEND_ModelFinalize: delete model state
I0304 06:27:45.092836 102 model_lifecycle.cc:578] successfully unloaded 'tts_postprocessor-Jaz_v1' version 1
I0304 06:27:45.102135 102 tensorrt.cc:5604] TRITONBACKEND_ModelFinalize: delete model state
I0304 06:27:45.102301 102 model_lifecycle.cc:578] successfully unloaded 'riva-trt-hifigan-Jaz_v1' version 1
I0304 06:27:45.106678 102 model_lifecycle.cc:578] successfully unloaded 'tts_preprocessor-Jaz_v1' version 1
> Riva waiting for Triton server to load all models...retrying in 1 second
I0304 06:27:46.085741 102 server.cc:302] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
> Riva waiting for Triton server to load all models...retrying in 1 second
> Riva waiting for Triton server to load all models...retrying in 1 second
> Triton server died before reaching ready state. Terminating Riva startup.
Check Triton logs with: docker logs
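Since the one relevant error is buried among several large model configuration dumps, a small filter can help when comparing runs across version combinations. The sketch below assumes the log format seen above: error lines start with `E` and contain `failed to load '<model>' version <n>: <reason>`. `failed_models` and `ir_mismatch` are hypothetical helper names, not part of Riva or Triton.

```python
import re

# Matches the model_lifecycle.cc error lines shown in the log above.
FAILED_LOAD = re.compile(
    r"failed to load '(?P<model>[^']+)' version (?P<version>\d+): (?P<reason>.*)"
)
# Matches the ONNX Runtime IR version complaint inside the failure reason.
IR_MISMATCH = re.compile(
    r"Unsupported model IR version: (\d+), max supported IR version: (\d+)"
)


def failed_models(log_text):
    """Return (model, version, reason) for every error-level load failure."""
    failures = []
    for line in log_text.splitlines():
        if not line.startswith("E"):  # keep only error-severity log lines
            continue
        match = FAILED_LOAD.search(line)
        if match:
            failures.append(
                (match["model"], int(match["version"]), match["reason"])
            )
    return failures


def ir_mismatch(reason):
    """Return (model_ir, max_supported_ir) if the reason is an IR mismatch."""
    match = IR_MISMATCH.search(reason)
    return (int(match[1]), int(match[2])) if match else None
```

Feeding it the output of `docker logs` for the Riva container would list only the models that failed to load, together with the IR versions involved when that is the cause.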
Hello, I have a similar issue when deploying a finetuned Conformer model. Any updates? Thank you.
Hardware - GPU (T4)
Hardware - CPU
Operating System - Ubuntu 20.04 running on AWS EC2 g4dn.2xlarge instance
I am currently trying to convert a model (several of different types, but for now not even a FastPitch model is working). In the past I had deployed several NeMo pipelines to Riva, but that development environment was lost during some updates, and I have not been able to convert and deploy any NeMo models since. I believe the lost environment used NeMo 1.20.0 with Riva and nemo2riva version 2.13.1; however, using those versions no longer seems to work for me.
Recently I have been testing several versions of NeMo, nemo2riva and Riva using the Dockerfile below in order to deploy models. (I will update this with test results as I continue to try and retry combinations.)
As you can see in the Dockerfile, it uses a pretrained model from NGC; however, I get the same error even with a .nemo model that was trained using the latest NeMo version.
The error relates to nvidia-eff being unable to encrypt the model. This error is consistent regardless of the NeMo image used. I have also tried a PyTorch base image, but this results in the same errors. I tried a riva-servicemaker image too, but that had an issue because the image uses Python 3.8, whereas NeMo has required Python 3.10 for a long time.
I really need some help resolving this, as development is being delayed because I currently cannot update any ASR, NLP, or TTS models. How can I resolve this?