NVIDIA / NeMo-Aligner

Scalable toolkit for efficient model alignment
Apache License 2.0
620 stars 78 forks

Cherry pick feat: Adds Rejection Sampling Algorithm #329

terrykong closed 1 month ago

terrykong commented 1 month ago

What does this PR do?

This is a manual cherry-pick since the automation did not seem to create a PR.

@ko3n1g, could you look into why the automation did not open one when you get a chance?

Changelog

Usage

# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

Checklist when contributing a new algorithm

Additional Information