Open fr0zenshard opened 4 months ago
I've tried models trained with previous versions of NeMo as well as models newly trained with version 1.23.0, etc.
hey?
I've been fighting this for days. Using the latest NeMo, Riva, etc. in a fresh venv, I hit the same problem, but adding "--onnx-opset=14" to the nemo2riva command for FastPitch seems to work.
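For anyone who wants to try it, here is a rough sketch of that workaround applied to the export command from this issue. The file names `fastpitch.nemo` and `fastpitch` are placeholders I made up, and I have only assumed the flag can go next to the other options; substitute your own paths and key.

```shell
# Placeholder paths (assumptions, not from the original report).
NEMO_MODEL="fastpitch.nemo"
OUT_RIVA="fastpitch"

# Same command as in the issue, with --onnx-opset=14 added as suggested above.
CMD="nemo2riva --out ${OUT_RIVA}.riva ${NEMO_MODEL} --key tlt_encode --onnx-opset=14 --runtime-check"

# Print the command first so it can be checked before running it.
echo "$CMD"

# Uncomment to actually run the export (needs nemo2riva installed):
# $CMD
```

Whether opset 14 is the right value for other NeMo / nemo2riva combinations is not something this thread confirms; it is only what was reported to work here.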
Description
When I try to export a model to Riva (using nemo2riva==2.14.0), the FastPitch model export doesn't work (meanwhile, the HiFiGAN export works wonderfully). I've tried various combinations of onnx and onnxruntime, combinations of NeMo and nemo2riva versions, but in the end, nothing works and the following error always pops up:
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Failed to load model with error: Invalid tensor data type 0.
Steps/Code to reproduce bug
Using NeMo image (1.21.0, 1.22.0, 1.23.0) + installed nemo2riva==2.14.0:
nemo2riva --out $(OUT_RIVA).riva $(NEMO_MODEL) --key tlt_encode --runtime-check
Additional info
Without --runtime-check, you can get an exported .riva model, but it doesn't work in the end (and the model is also huge).
The failed results mentioned by the authors of the issue above can be obtained by:
runtime_check
In such an environment, the rmir build and deploy steps will complete, but at Riva server startup an error like the following appears:
-UNAVAILABLE: Internal: onnx runtime error 1: Load model from /data/models/riva-onnx-fastpitch_encoder-test/1/model.onnx failed:Invalid tensor data type 0.