linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training
https://arxiv.org/pdf/2410.10989
BSD 2-Clause "Simplified" License
3.65k stars, 217 forks

PreferenceBase with Softcapping? #426

Open cinjon opened 12 hours ago

cinjon commented 12 hours ago

🚀 The feature, motivation and pitch

There's softcapping in the FusedLinearCrossEntropy; it would be nice to have this natively for PreferenceBase too.
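
For readers unfamiliar with the term: softcapping usually means squashing logits through a scaled tanh, `cap * tanh(logits / cap)`, so they stay within `(-cap, cap)`. The sketch below is plain Python under that assumption and only illustrates where the step would sit in a preference-style loss, namely before the per-token log-probabilities are computed. All function names here are hypothetical and are not Liger-Kernel's API.

```python
import math


def softcap(logit: float, cap: float) -> float:
    """Smoothly bound one logit to (-cap, cap) via cap * tanh(logit / cap)."""
    return cap * math.tanh(logit / cap)


def softcap_logits(logits, cap=None):
    """Apply softcapping elementwise; cap=None returns the logits unchanged."""
    if cap is None:
        return list(logits)
    if cap <= 0:
        raise ValueError("cap must be positive")
    return [softcap(x, cap) for x in logits]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def token_logp(logits, target, cap=None):
    """Log-probability of `target` after optional softcapping (hypothetical helper)."""
    return log_softmax(softcap_logits(logits, cap))[target]


# Usage: the same logits scored with and without a cap.
logits = [50.0, -50.0, 0.0]
uncapped = token_logp(logits, target=0)
capped = token_logp(logits, target=0, cap=30.0)
```

With a cap, extreme logits are compressed, so the capped log-probability of the top token is lower than the uncapped one. A preference loss would compute this for both the chosen and the rejected sequence.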

Alternatives

No response

Additional context

No response

ByronHsu commented 11 hours ago

Are you willing to contribute? The change should be starter-friendly.