neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/
Other
2.94k stars 169 forks source link

[Fix] ONNX model benchmarking when no sequence_length inferred from the model #1581

Closed dbogunowicz closed 5 months ago

dbogunowicz commented 5 months ago

Fix for: https://app.asana.com/0/1206109050183159/1206490681470513/f

The model zoo:llama2-7b-open_platypus_orca_llama2_pretrain-pruned40_quantized does not have the sequence length set internally in the onnx model (@alexm-nm is that something that we expect in the first place)? Because we cannot deduce this parameter, we are crashing with a very obscure message. This PR adds more verbose termination, that guides the user towards the right action (providing the sequence length manually)