jiqing-feng closed 1 month ago
Enable QKV concat linear in Llama, which brings a 10% speed-up on CPU
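For readers unfamiliar with the idea, here is a minimal sketch of what a concatenated QKV linear does, written in plain Python rather than the framework code this PR touches. It assumes bias-free projections; the helper names (`linear`, `concat_qkv`, `fused_qkv`) and the toy shapes are hypothetical and are not taken from the PR. The three weight matrices are stacked along the output dimension, a single projection is run, and the result is sliced back into q, k, and v.

```python
# Hypothetical illustration: names and shapes are assumptions, not the PR's code.

def linear(x, weight):
    """Compute y = weight @ x for one input vector x (no bias)."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weight]

def concat_qkv(w_q, w_k, w_v):
    """Stack the three weight matrices along the output dimension."""
    return w_q + w_k + w_v

def fused_qkv(x, w_qkv, q_dim, k_dim):
    """Run one projection, then slice the output back into q, k, v."""
    out = linear(x, w_qkv)
    return out[:q_dim], out[q_dim:q_dim + k_dim], out[q_dim + k_dim:]

# Small integer weights keep the comparison exact.
x = [1, 2, 3]
w_q = [[1, 0, 0], [0, 1, 0]]
w_k = [[0, 0, 1], [1, 1, 1]]
w_v = [[2, 0, 1]]

w_qkv = concat_qkv(w_q, w_k, w_v)
q, k, v = fused_qkv(x, w_qkv, q_dim=len(w_q), k_dim=len(w_k))

# The fused path matches the three separate projections.
assert q == linear(x, w_q)
assert k == linear(x, w_k)
assert v == linear(x, w_v)
```

The results are identical to running three separate projections. Any speed-up in a real model would come from issuing one larger matrix multiplication instead of three smaller ones, which this toy version does not measure.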