microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

onnx inferencesession load hangs with quartznet asr config. #4974

Closed darraghdog closed 4 years ago

darraghdog commented 4 years ago

Describe the bug The ONNX model exported from QuartzNet cannot be loaded for inference. The code snippet below runs fine up until encoder_session = onnxruntime.InferenceSession(filename); at that point the code just hangs with no error. I left it for ~10 minutes.
See the code snippet below: the model generated from the first config loads for me, the model generated from the second config does not.

Urgency We would like to use this in a product being launched later this year. Speeding up inference would be very helpful.

System information

To Reproduce

import os
import torch
import onnx
from omegaconf import DictConfig
from nemo.collections.asr.modules import ConvASREncoder

# This config can be loaded via the Inference Session
encoder_dict = {
    'cls': 'nemo.collections.asr.modules.ConvASREncoder',
    'params': {
        'feat_in': 64,
        'activation': 'relu',
        'conv_mask': True,
        'jasper': [
            {'filters': 1024, 'repeat': 1, 'kernel': [1], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': False, 'separable': True, 'se': True, 'se_context_size': -1}
        ],
    },
}

# This config cannot be loaded via the Inference Session
encoder_dict = {
    'cls': 'nemo.collections.asr.modules.ConvASREncoder',
    'params': {
        'feat_in': 64,
        'activation': 'relu',
        'conv_mask': True,
        'jasper': [
            {'filters': 256, 'repeat': 1, 'kernel': [33], 'stride': [2], 'dilation': [1], 'dropout': 0.0, 'residual': False, 'separable': True},
            {'filters': 256, 'repeat': 5, 'kernel': [33], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 256, 'repeat': 5, 'kernel': [33], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 256, 'repeat': 5, 'kernel': [33], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 256, 'repeat': 5, 'kernel': [39], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 256, 'repeat': 5, 'kernel': [39], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 256, 'repeat': 5, 'kernel': [39], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 5, 'kernel': [51], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 5, 'kernel': [51], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 5, 'kernel': [51], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 5, 'kernel': [63], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 5, 'kernel': [63], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 5, 'kernel': [63], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 5, 'kernel': [75], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 5, 'kernel': [75], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 5, 'kernel': [75], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': True, 'separable': True},
            {'filters': 512, 'repeat': 1, 'kernel': [87], 'stride': [1], 'dilation': [2], 'dropout': 0.0, 'residual': False, 'separable': True},
            {'filters': 1024, 'repeat': 1, 'kernel': [1], 'stride': [1], 'dilation': [1], 'dropout': 0.0, 'residual': False}
        ],
    },
}

INPATH = '/Users/dhanley/Documents/gazeapi/ai-tasks/gazeproto/experiment'
filename = os.path.join(INPATH, 'qn_encoder.onnx')
encoder_instance = ConvASREncoder.from_config_dict(DictConfig(encoder_dict))
encoder_instance.export(output=filename)
# Create dummy inputs
dummy_processed_signal = torch.randn([1, 64, 5104])
dummy_processed_signal_len = torch.tensor([5103])
# Check model
quartznet_encoder = onnx.load(filename)
onnx.checker.check_model(quartznet_encoder)
# Load inference session
import onnxruntime
encoder_session = onnxruntime.InferenceSession(filename)
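For scale, here is a rough comparison I put together (my own numbers, not part of the original report): summing the `repeat` fields of each `jasper` entry shows the failing config is a full QuartzNet-15x5-style stack, while the working config is a single block. A much larger graph means far more work for ONNX Runtime at session-creation time.

```python
# Rough size comparison of the two configs above: total number of
# repeated sub-blocks in each 'jasper' list. Counts are copied from
# the configs in this issue; this is illustrative only.
working = [1]                      # first config: one block, repeat=1
failing = [1] + [5] * 15 + [1, 1]  # second config: QuartzNet 15x5-style stack

print(sum(working), sum(failing))  # 1 vs 78 repeated sub-blocks
```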

Expected behavior The inference session for the second config should load just as the first one does.

darraghdog commented 4 years ago

Sorry, I meant to file this under the repo that developed the model. Thanks for the great work, loving ONNX!