linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training
https://arxiv.org/pdf/2410.10989
BSD 2-Clause "Simplified" License
3.67k stars, 218 forks

PreferenceBase with Softcapping? #426

Open. cinjon opened this issue 23 hours ago

cinjon commented 23 hours ago

🚀 The feature, motivation and pitch

FusedLinearCrossEntropy already supports softcapping; it would be nice to have this natively in PreferenceBase too.
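For readers unfamiliar with the term: softcapping, as it is commonly defined (for example in Gemma 2), squashes logits into the range (-softcap, softcap) with `softcap * tanh(logits / softcap)`. Below is a minimal plain-Python sketch of that formula only. It is not Liger-Kernel code, and the helper name `softcap_logits` is hypothetical; how this would be wired into PreferenceBase is left open here.

```python
import math


def softcap_logits(logits, softcap):
    """Hypothetical helper: apply tanh softcapping to a list of logits.

    Each value x is mapped to softcap * tanh(x / softcap), which is close to
    the identity for |x| much smaller than softcap and saturates near
    +/- softcap for large |x|.
    """
    if softcap <= 0:
        raise ValueError("softcap must be positive")
    return [softcap * math.tanh(x / softcap) for x in logits]


# Illustrative values only: large logits saturate, small ones barely change.
capped = softcap_logits([-100.0, -1.0, 0.0, 1.0, 100.0], softcap=30.0)
```

In a preference loss, the same transform would presumably be applied to the logits before the log-probabilities of the chosen and rejected responses are computed, mirroring what the cross-entropy path does.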

Alternatives

No response

Additional context

No response

ByronHsu commented 22 hours ago

Are you willing to contribute? The change should be starter-friendly.