nvidia-riva / nemo2riva

NeMo -> Riva Conversion Tool
MIT License

Conformer CTC converted with nemo2riva 2.13.1 deployed on Riva 2.13.1 fails to load #36

Open itzsimpl opened 10 months ago

itzsimpl commented 10 months ago

I have a Conformer CTC model built with the NeMo framework (https://github.com/NVIDIA/NeMo) that converts and deploys without problems on Riva 2.11.0. However, when I convert the same NeMo file with nemo2riva 2.13.1 and deploy it on Riva 2.13.1, Riva (Triton server) fails to start with the following error:

UNAVAILABLE: Internal: onnx runtime error 1: Load model from /data/models/streaming/1/model.onnx failed: /workspace/onnxruntime/onnxruntime/core/graph/model.cc:146 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) Unsupported model IR version: 9, max supported IR version: 8

I have tried building with both --onnx_opset=15 and --onnx_opset=17, as suggested in https://github.com/NVIDIA/NeMo/discussions/7278, but neither helps.

itzsimpl commented 9 months ago

The same issue occurs with Riva 2.14.0, even when deploying a model built with the latest NeMo (1.22.0). It appears to be caused by an incompatibility between the version of the onnx library used by NeMo (1.14.0) and the one used by the Riva Triton server, which per the error above accepts IR version 8 at most.

A workaround is to downgrade the onnx library to 1.13.0 before running the nemo2riva conversion. Alternatively, build a TRT engine instead: do not pass the --nn.use_onnx_runtime parameter to riva-build when building the speech recognition Riva pipeline (i.e. when converting from .riva to .rmir).
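To avoid hitting this again by accident, a pre-flight check on the installed onnx version can be run before the conversion. This is a rough sketch with hypothetical helper names. The 1.14 threshold is an assumption taken from this thread (1.13.0 works, 1.14.0 does not), and versions older than 1.13.0 are treated as compatible even though they were not tested here.

```python
import re
from importlib import metadata
from typing import Optional

# Assumption from this thread: onnx 1.14.0 writes models the Riva Triton
# server rejects, while onnx 1.13.0 output loads fine.
FIRST_INCOMPATIBLE = (1, 14)


def is_compatible_onnx(version: str) -> bool:
    """Return True if the given onnx version string is older than 1.14."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return major_minor < FIRST_INCOMPATIBLE


def installed_onnx_version() -> Optional[str]:
    """Return the installed onnx version, or None if onnx is not installed."""
    try:
        return metadata.version("onnx")
    except metadata.PackageNotFoundError:
        return None
```

For example, calling is_compatible_onnx(installed_onnx_version()) in the conversion environment returns False when onnx 1.14.0 is installed, which is the cue to downgrade before running nemo2riva or to build a TRT engine instead.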