premAI-io / benchmarks

🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
MIT License

ONNX Runtime with mistral support and memory profiling #182

Closed Anindyadeep closed 2 months ago

Anindyadeep commented 2 months ago

This PR carries over all the changes from PR https://github.com/premAI-io/benchmarks/pull/167 and integrates them into ONNX Runtime via HF Optimum. The ONNX Runtime LLM README now includes a quality-checks table for both Llama 2 Chat and Mistral Instruct.
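The PR title also mentions memory profiling. As a rough illustration only (the repo's actual profiling code may look quite different), here is a minimal sketch of measuring peak memory around a benchmark call using Python's standard-library `tracemalloc`; `run_inference` is a hypothetical stand-in for the real ONNX Runtime inference entry point:

```python
import tracemalloc

def run_inference():
    # Placeholder workload; a real benchmark would invoke the
    # ONNX Runtime session / Optimum pipeline here instead.
    return [i * i for i in range(100_000)]

def profile_memory(fn):
    """Run fn() and return (result, peak bytes allocated during the call)."""
    tracemalloc.start()
    try:
        result = fn()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

result, peak = profile_memory(run_inference)
print(f"peak memory during inference: {peak / (1024 * 1024):.2f} MiB")
```

Note that `tracemalloc` only tracks Python-level allocations; native memory held by the ONNX Runtime itself would need an OS-level tool such as `psutil` to capture.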

Anindyadeep commented 2 months ago

LGTM, @Anindyadeep — just resolve the merge conflicts.

Done