Mistral AWQ with memory profiling

premAI-io / benchmarks

🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.

MIT License


Closed by Anindyadeep 5 months ago

Anindyadeep commented 5 months ago

This PR brings in all the changes from PR #167 and integrates them into AWQ. The AWQ README now has a quality checks table for both Llama 2 and Mistral.

Anindyadeep commented 5 months ago

For docs/llama2.md.template, I added an initial draft of the numbers. Let me know whether that looks good or not, or we can discuss it.

Anindyadeep commented 5 months ago

@Anindyadeep let's first merge PR #160 into main and then merge main into this branch, so that I only have to check the AWQ changes.

Yes, that works.