linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training
https://arxiv.org/pdf/2410.10989
BSD 2-Clause "Simplified" License
3.67k stars, 218 forks

PreferenceBase with Softcapping? #426

Open cinjon opened 1 day ago

cinjon commented 1 day ago

🚀 The feature, motivation and pitch

There's softcapping in the FusedLinearCrossEntropy; it would be nice to have this natively for PreferenceBase too.
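For readers unfamiliar with the term, here is a minimal sketch of what softcapping usually means, assuming the common cap * tanh(x / cap) formulation applied to logits before the log-softmax. The helper names (softcap, log_softmax, token_logp) are hypothetical and are not Liger-Kernel's API; a preference loss would apply the same transform before computing the chosen and rejected log-probabilities.

```python
import math

def softcap(logits, cap):
    """Squash each logit into (-cap, cap) via cap * tanh(x / cap)."""
    return [cap * math.tanh(x / cap) for x in logits]

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def token_logp(logits, target, cap=None):
    """Log-probability of `target`, optionally softcapping the logits first."""
    if cap is not None:
        logits = softcap(logits, cap)
    return log_softmax(logits)[target]

# Illustrative values only (assumed, not taken from the issue).
logits = [50.0, -3.0, 0.5]
capped = softcap(logits, cap=30.0)
```

Small logits pass through almost unchanged, while large ones saturate near the cap, so the log-probabilities a preference loss compares can no longer be dominated by one extreme logit.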

Alternatives

No response

Additional context

No response

ByronHsu commented 1 day ago

Are you willing to contribute? The change should be starter-friendly.