linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training
https://arxiv.org/pdf/2410.10989
BSD 2-Clause "Simplified" License

PreferenceBase with Softcapping? #426

Open cinjon opened 14 hours ago

cinjon commented 14 hours ago

🚀 The feature, motivation and pitch

FusedLinearCrossEntropy already supports softcapping; it would be nice to have the same option natively in PreferenceBase too.
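For context, here is a minimal sketch of what softcapping usually means, assuming the common tanh form `softcap * tanh(logits / softcap)`. It is written in plain Python rather than Triton, and the function names (`softcap_logits`, `log_softmax`, `token_logprob`) are hypothetical illustrations, not Liger-Kernel APIs. The idea is that a preference loss would cap the logits before taking the per-token log-probabilities it compares between chosen and rejected responses.

```python
import math

def softcap_logits(logits, softcap):
    """Squash each logit into (-softcap, softcap) via softcap * tanh(x / softcap).

    Assumption: this is the tanh-style softcap; softcap=None disables it.
    """
    if softcap is None:
        return list(logits)
    return [softcap * math.tanh(x / softcap) for x in logits]

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def token_logprob(logits, target, softcap=None):
    """Log-probability of `target` after optional softcapping (hypothetical helper)."""
    return log_softmax(softcap_logits(logits, softcap))[target]
```

With `softcap=None` the helper reduces to an ordinary log-softmax lookup, so the option could plausibly default to off without changing existing behavior.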


ByronHsu commented 13 hours ago

Are you willing to contribute? The change should be starter-friendly.