PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
https://paddlenlp.readthedocs.io
Apache License 2.0
11.71k stars 2.86k forks

[LLM] Support prefix tuning and lora for qwen2 #8601

Closed: DrownFish19 closed this pull request 1 week ago

DrownFish19 commented 2 weeks ago

PR types

Function optimization

PR changes

Models

Description

  1. Support prefix tuning and LoRA;
  2. Fix modeling and tokenizer when tie_word_embedding=True (Qwen1.5-0.5B, Qwen2-0.5B, Qwen2-1.5B);
  3. Fix pipeline parallelism and sequence parallelism;
  4. Add unit tests;
  5. Add LLM unit tests;
  6. Upload pretrain, SFT, and LoRA configs.

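
For readers unfamiliar with item 1, the idea behind LoRA can be sketched in a few lines of plain Python. This is a conceptual illustration only, not the code added by this PR: the helper names (`matmul`, `lora_effective_weight`, `lora_trainable_params`), the `[in, out]` weight layout, and the numeric values are all assumptions made for the example.

```python
# Conceptual LoRA sketch in plain Python (not PaddleNLP's implementation).
# Assumed layout: W is [in_features, out_features], A is [in_features, r],
# B is [r, out_features]; all helper names here are hypothetical.


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (A @ B), the weight used after merging LoRA."""
    r = len(b)  # rank = number of rows of B
    scale = alpha / r
    delta = matmul(a, b)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_trainable_params(in_features, out_features, r):
    """Count the parameters in A and B; W itself stays frozen."""
    return r * (in_features + out_features)


w = [[1.0, 2.0], [3.0, 4.0]]   # frozen base weight
a = [[1.0], [2.0]]             # rank-1 factor A
b_zero = [[0.0, 0.0]]          # B starts at zero, so training starts from W
b_trained = [[0.5, -1.0]]      # made-up values standing in for a trained B

merged_at_init = lora_effective_weight(w, a, b_zero, alpha=2.0)
merged_after = lora_effective_weight(w, a, b_trained, alpha=2.0)
```

With a zero-initialized B the merged weight equals W exactly. For a 1024 by 1024 layer at rank 8, the adapter trains 16,384 parameters instead of 1,048,576.
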
paddle-bot[bot] commented 2 weeks ago

Thanks for your contribution!

codecov[bot] commented 2 weeks ago

Codecov Report

Attention: Patch coverage is 52.32558% with 41 lines in your changes missing coverage. Please review.

Project coverage is 54.73%. Comparing base (970b868) to head (cb19d31).

| Files | Patch % | Lines |
|---|---|---|
| paddlenlp/transformers/qwen2/modeling_pp.py | 34.37% | 21 Missing :warning: |
| paddlenlp/transformers/qwen2/modeling.py | 68.42% | 12 Missing :warning: |
| paddlenlp/transformers/qwen2/tokenizer.py | 41.66% | 7 Missing :warning: |
| paddlenlp/transformers/model_utils.py | 50.00% | 1 Missing :warning: |
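
The patch-coverage headline can be cross-checked against the per-file rows. The short script below is only a sanity check on the numbers quoted in this report. The variable names are illustrative, and the total of 86 coverable lines in the patch is inferred from the reported percentage; Codecov does not state it here.

```python
# Cross-check of the Codecov patch numbers quoted in this report.
missing_per_file = {
    "paddlenlp/transformers/qwen2/modeling_pp.py": 21,
    "paddlenlp/transformers/qwen2/modeling.py": 12,
    "paddlenlp/transformers/qwen2/tokenizer.py": 7,
    "paddlenlp/transformers/model_utils.py": 1,
}
total_missing = sum(missing_per_file.values())

reported_patch_pct = 52.32558
# Inferred (assumption): coverable patch lines = missing / (1 - coverage).
inferred_patch_lines = round(total_missing / (1 - reported_patch_pct / 100))
covered_lines = inferred_patch_lines - total_missing
recomputed_pct = 100 * covered_lines / inferred_patch_lines

print(total_missing, inferred_patch_lines, covered_lines, recomputed_pct)
```

The per-file missing counts sum to the 41 lines in the headline, and 45 covered lines out of an inferred 86 reproduces the reported 52.32558%.
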
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           develop    #8601      +/-   ##
===========================================
+ Coverage    54.18%   54.73%   +0.55%
===========================================
  Files          625      625
  Lines        98942    98985      +43
===========================================
+ Hits         53612    54180     +568
+ Misses       45330    44805     -525
```
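
The coverage percentages in the diff follow directly from the hit and line counts. A minimal check, using only the figures in the coverage diff (the function name `coverage_pct` is illustrative):

```python
def coverage_pct(hits, lines):
    """Coverage as a percentage of tracked lines."""
    return 100 * hits / lines


base = coverage_pct(53612, 98942)   # develop
head = coverage_pct(54180, 98985)   # this PR
delta = head - base

# Hits plus misses should account for every line in each column.
assert 53612 + 45330 == 98942
assert 54180 + 44805 == 98985

print(f"{base:.4f} {head:.4f} {delta:+.4f}")
```

Both values agree with the reported 54.18% and 54.73% to within 0.01 percentage points, and their difference matches the reported +0.55%.
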
