PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis, etc.
https://paddlenlp.readthedocs.io
Apache License 2.0
11.99k stars, 2.93k forks

Sft flash mask #8664

Closed wtmlon closed 3 months ago

wtmlon commented 3 months ago

PR types

PR changes

Description

Adapt the flash mask data flow to support SFT and DPO.

paddle-bot[bot] commented 3 months ago

Thanks for your contribution!

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 2.94118% with 33 lines in your changes missing coverage. Please review.

Project coverage is 55.61%. Comparing base (3ebe938) to head (1f6bfd1). Report is 228 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| paddlenlp/data/data_collator.py | 0.00% | 11 Missing :warning: |
| paddlenlp/trl/trl_data.py | 0.00% | 10 Missing :warning: |
| paddlenlp/datasets/zero_padding_dataset.py | 0.00% | 4 Missing :warning: |
| paddlenlp/trl/dpo_trainer.py | 0.00% | 4 Missing :warning: |
| paddlenlp/transformers/llama/fusion_ops.py | 0.00% | 2 Missing :warning: |
| paddlenlp/transformers/llama/modeling.py | 33.33% | 2 Missing :warning: |
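
The 2.94118% patch figure can be cross-checked against the per-file counts above. The following is a minimal sketch, assuming patch coverage is computed as covered lines divided by covered plus missing lines. The single covered line is an inference (the report does not state it directly): `modeling.py` shows 33.33% with 2 lines missing, which is consistent with 1 covered line out of 3.

```python
# Missing-line counts copied from the Codecov table in this report.
missing = {
    "paddlenlp/data/data_collator.py": 11,
    "paddlenlp/trl/trl_data.py": 10,
    "paddlenlp/datasets/zero_padding_dataset.py": 4,
    "paddlenlp/trl/dpo_trainer.py": 4,
    "paddlenlp/transformers/llama/fusion_ops.py": 2,
    "paddlenlp/transformers/llama/modeling.py": 2,
}

# Assumption: exactly one patch line is covered (inferred from the
# 33.33% figure for modeling.py; every other file is at 0.00%).
covered = 1

total_missing = sum(missing.values())
patch_lines = covered + total_missing

# Assumed formula: covered / (covered + missing), as a percentage.
patch_coverage = 100 * covered / patch_lines

print(f"{total_missing} missing of {patch_lines} patch lines -> {patch_coverage:.5f}%")
```

Under these assumptions the counts reproduce both the 33 missing lines and the reported patch percentage.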
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           develop    #8664      +/-   ##
===========================================
- Coverage    55.62%   55.61%   -0.01%
===========================================
  Files          620      620
  Lines        96949    96965      +16
===========================================
+ Hits         53929    53930       +1
- Misses       43020    43035      +15
```
