mlcommons / inference_policies

Issues related to MLPerf™ Inference policies, including rules and suggested changes
https://mlcommons.org/en/groups/inference/
Apache License 2.0

Add and revise rules related to llama2 #287

Closed by nvzhihanj 7 months ago

nvzhihanj commented 8 months ago
  1. Add missing model details for GPT-J and Llama 2.
  2. Allow PagedAttention and continuous batching in the Q&A section of the LLM rules.
  3. Disallow speculative decoding in the rules.

github-actions[bot] commented 8 months ago

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅