linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training
https://arxiv.org/pdf/2410.10989
BSD 2-Clause "Simplified" License

PreferenceBase with Softcapping? #426

Open cinjon opened 9 hours ago

cinjon commented 9 hours ago

🚀 The feature, motivation and pitch

FusedLinearCrossEntropy already supports softcapping; it would be nice to have this natively in PreferenceBase too.
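For context, logit softcapping is commonly defined as `softcap * tanh(logits / softcap)`, applied before the log-softmax. The snippet below is a minimal pure-Python sketch of that idea, not Liger-Kernel's actual implementation or API; the function names `softcap_logits`, `log_softmax`, and `token_logprob` are hypothetical and exist only for illustration.

```python
import math


def softcap_logits(logits, softcap):
    """Smoothly squash each logit so its magnitude is bounded by `softcap`.

    Hypothetical helper for illustration; `softcap=None` disables capping.
    """
    if softcap is None:
        return list(logits)
    return [softcap * math.tanh(x / softcap) for x in logits]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def token_logprob(logits, target, softcap=None):
    """Log-probability of index `target` after optional softcapping."""
    return log_softmax(softcap_logits(logits, softcap))[target]


# Extreme logits are bounded by the cap; small logits are nearly unchanged.
capped = softcap_logits([1000.0, -1000.0, 0.3], softcap=30.0)
```

In a preference loss, the assumption is that the same capping step would be applied to the logits of both the chosen and the rejected sequences before their log-probabilities are computed.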

Alternatives

No response

Additional context

No response

ByronHsu commented 8 hours ago

Are you willing to contribute? The change should be starter-friendly.