NVIDIA / NeMo-Aligner

Scalable toolkit for efficient model alignment

Apache License 2.0

628 stars 78 forks

feat: add knowledge distillation support for SFT #348

Closed: ashors1 closed this 2 weeks ago

ashors1 commented 1 month ago

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Changelog

Please update the CHANGELOG.md under next version with high level changes in this PR.

Usage

You can potentially add a usage example below

# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

[ ] Make sure you have read and followed the Contributor guidelines
[ ] Did you write any new necessary tests?
[ ] Did you add or update any necessary documentation? Make sure to also update the NeMo Framework User Guide, which contains the tutorials.

Checklist when contributing a new algorithm

[ ] Does the trainer resume and restore all states, including model state?
[ ] Does the trainer support all parallelism techniques (PP, TP, DP)?
[ ] Does the trainer support max_steps=-1 and validation?
[ ] Does the trainer only call APIs defined in alignable_interface.py?
[ ] Does the trainer have proper logging?

Additional Information

Related to # (issue)