PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

[cpu]llama avx model inference supports #8634

Closed bukejiyu closed 3 months ago

bukejiyu commented 3 months ago

PR types

PR changes

Description

Integrates the xft CPU kernel into Paddle inference_mode.
Machine: 8463B
Input/output: 128/15, bs=1
Static-graph llama speed test, next_tokens: 100+ ms
48 threads
Dynamic-graph llama speed test, next_tokens: 70+ ms
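For readers who think in throughput rather than latency, the per-token figures in the description can be converted with a short calculation. This is only an arithmetic sketch: the helper name `tokens_per_second` is hypothetical, and because the description reports "100+ms" and "70+ms" (lower bounds on latency), the results are upper bounds on throughput, not measured values.

```python
def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to tokens per second."""
    return 1000.0 / ms_per_token


# Latency lower bounds taken from the PR description (next_tokens timings).
reported_ms = {
    "static graph": 100.0,
    "dynamic graph": 70.0,
}

for label, ms in reported_ms.items():
    # The reported latency is "at least" this value, so throughput is "at most".
    print(f"{label}: at most {tokens_per_second(ms):.1f} tokens/s")
```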

paddle-bot[bot] commented 3 months ago

Thanks for your contribution!

CLAassistant commented 3 months ago

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


395822456@qq.com does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 0% with 282 lines in your changes missing coverage. Please review.

Project coverage is 55.63%. Comparing base (65e721e) to head (b791375). Report is 243 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...dlenlp/experimental/transformers/llama/modeling.py | 0.00% | 133 Missing :warning: |
| ...enlp/experimental/transformers/generation_utils.py | 0.00% | 101 Missing :warning: |
| ...erimental/transformers/fused_transformer_layers.py | 0.00% | 48 Missing :warning: |
Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           develop    #8634      +/-   ##
===========================================
- Coverage    55.80%   55.63%   -0.18%
===========================================
  Files          620      620
  Lines        96642    96940     +298
===========================================
  Hits         53928    53928
- Misses       42714    43012     +298
```
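The figures in the coverage report can be cross-checked with a few lines of arithmetic. The sketch below only reuses the numbers shown in the Codecov comment; the function name `coverage_percent` is hypothetical, and it assumes coverage is simply hits divided by total lines.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, assuming coverage = hits / lines."""
    return 100.0 * hits / lines


# Numbers copied from the Codecov diff (base = develop, head = #8634).
base_hits, base_lines = 53928, 96642
head_hits, head_lines = 53928, 96940

base_cov = coverage_percent(base_hits, base_lines)
head_cov = coverage_percent(head_hits, head_lines)

# Misses are the lines that are not hit.
base_misses = base_lines - base_hits
head_misses = head_lines - head_hits

# Per-file missing lines from the "Files with missing lines" list.
patch_missing = 133 + 101 + 48

print(f"base coverage: {base_cov:.2f}%  misses: {base_misses}")
print(f"head coverage: {head_cov:.2f}%  misses: {head_misses}")
print(f"uncovered patch lines: {patch_missing}")
```

Since the hit count is unchanged while the line count grows, every added line counts as a miss, which is why project coverage drops.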
