neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

[Question] about converting onnx model with dynamic batch size input to deepsparse model #1647

Open phamkhactu opened 6 months ago

phamkhactu commented 6 months ago

I tried to convert an ONNX model with a dynamic batch size input into a DeepSparse engine:

from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs
onnx_filepath = "tts_model.onnx"
batch_size = 4

# Generate random sample input
inputs = generate_random_inputs(onnx_filepath, batch_size)

# Compile and run
engine = compile_model(onnx_filepath, batch_size)
print(engine)
outputs = engine.run(inputs)

I get this error:

[nm_ort 7f4a13f2b280 >WARN<  is_supported_graph src/onnxruntime_neuralmagic/supported/ops.cc:199] Warning: Optimized runtime disabled - Detected dynamic input input dim 1. Set inputs to static shapes to enable optimal performance.
deepsparse.engine.Engine:
        onnx_file_path: tts_model.onnx
        batch_size: 4
        num_cores: 10
        num_streams: 1
        scheduler: Scheduler.default
        fraction_of_supported_ops: 0.0
        cpu_avx_type: avx2
        cpu_vnni: False
2024-05-06 17:31:54.756632238 [E:onnxruntime:, sequential_executor.cc:521 ExecuteKernel] Non-zero status code returned while running Gather node. Name:'Gather_token_15' Status Message: indices element out of data bounds, idx=2 must be within the inclusive range [-2,1]
Traceback (most recent call last):
  File "/home/tupk/tupk/nlp/custom/deploy_tts/deepsparse_to_onnx.py", line 12, in <module>
    outputs = engine.run(inputs)
  File "/home/tupk/anaconda3/envs/dl/lib/python3.10/site-packages/deepsparse/engine.py", line 532, in run
    return self._eng_net.execute_list_out(inp)
RuntimeError: NM: error: Non-zero status code returned while running Gather node. Name:'Gather_token_15' Status Message: indices element out of data bounds, idx=2 must be within the inclusive range [-2,1]

How can I use DeepSparse with this model?

Thank you very much!

mgoin commented 6 months ago

Hey @phamkhactu, from the error message "indices element out of data bounds, idx=2 must be within the inclusive range [-2,1]" it seems that the model is sensitive to the input data provided, since it uses that data as an index for a Gather operation. Could you try manually making an input of all zeros and passing that in? You could also precisely set the input shapes using the input_shapes= parameter.
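For reference, a rough sketch of both ideas; the input order, shapes, and dtypes below are assumptions about the TTS model (not taken from it), so adjust them to match the actual graph inputs:

import numpy as np
from deepsparse import compile_model

onnx_filepath = "tts_model.onnx"
batch_size = 1
seq_len = 100  # assumed fixed phoneme length

# Pin every dynamic dimension to a static shape, one List[int] per model input
# (assumed input order: input, input_lengths, scales, sid)
input_shapes = [[batch_size, seq_len], [batch_size], [3], [1]]
engine = compile_model(onnx_filepath, batch_size=batch_size, input_shapes=input_shapes)

# All-zero token ids with matching dtypes, to rule out data-dependent Gather indices
inputs = [
    np.zeros((batch_size, seq_len), dtype=np.int64),  # input (token ids)
    np.full((batch_size,), seq_len, dtype=np.int64),  # input_lengths
    np.array([0.68, 1.0, 1.0], dtype=np.float32),     # scales
    np.zeros((1,), dtype=np.int64),                    # sid
]
outputs = engine.run(inputs)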

phamkhactu commented 6 months ago

Hi @mgoin

I tried your suggestion, but I am not sure how to pass input_shapes for my model, since it expects a List[List[int]].

Here is my ONNX model. The code below produces my exported tts_model.onnx:

        dummy_input_length = 100
        sequences = torch.randint(low=0, high=2, size=(1, dummy_input_length), dtype=torch.long)
        sequence_lengths = torch.LongTensor([sequences.size(1)])
        scales = torch.FloatTensor([0.68, 1.0, 1.0])
        dummy_input = (sequences, sequence_lengths, scales)
        input_names = ["input", "input_lengths", "scales"]

        speaker_id = torch.LongTensor([0])
        dummy_input += (speaker_id,)
        input_names.append("sid")

        torch.onnx.export(
            model=self,
            args=dummy_input,
            opset_version=15,
            f=output_path,
            verbose=verbose,
            input_names=input_names,
            output_names=["output"],
            dynamic_axes={
                "input": {0: "batch_size", 1: "phonemes"},
                "input_lengths": {0: "batch_size"},
                "output": {0: "batch_size", 1: "time1", 2: "time2"},
            },
        )
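For completeness, which dimensions actually came out dynamic in the exported file can be double-checked by listing the graph inputs with the onnx package (a minimal sketch, assuming tts_model.onnx is the file produced by the export above):

import onnx

# Load the exported model and print each graph input with its dims;
# dynamic dims appear as symbolic names (e.g. "batch_size", "phonemes")
model = onnx.load("tts_model.onnx")
for inp in model.graph.input:
    dims = [d.dim_param if d.dim_param else d.dim_value
            for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)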

I tried to test it with DeepSparse, but I am getting this error:

from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs
import torch
import numpy as np

onnx_filepath = "tts_model.onnx"
batch_size = 1

# Generate random sample input
inputs = generate_random_inputs(onnx_filepath, batch_size)

# Compile and run
engine = compile_model(onnx_filepath, batch_size)
print(engine)

dummy_input_length = 100
# "input": shape (1, 100), int64 token ids
sequences = torch.randint(low=0, high=2, size=(1, dummy_input_length), dtype=torch.long).cpu().numpy()
# "input_lengths": shape (1,), int64
sequence_lengths = np.array([sequences.shape[1]], dtype=np.int64)
# "scales": shape (3,), float32
scales = np.array([0.68, 1.0, 1.0], dtype=np.float32)
# "sid": shape (1,), int64
speaker_id = torch.tensor([0]).cpu().numpy()

outputs = engine.run([[sequences, sequence_lengths, scales, speaker_id]])

This raises:

ValueError: array batch size of 3 must match the batch size the model was instantiated with 1

I would really appreciate your help.