pytorch / torchtitan

A native PyTorch Library for large model training
BSD 3-Clause "New" or "Revised" License
2.59k stars 204 forks

[Triton] Implement Liger Kernels #623

Open casper-hansen opened 3 weeks ago

casper-hansen commented 3 weeks ago

Consider implementing the Liger Kernels, which have been shown to yield large memory savings.

Benchmark: image

https://github.com/linkedin/Liger-Kernel
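
For a rough sense of scale, here is a back-of-the-envelope sketch of one place savings like this can come from: not materializing the full logits tensor before the cross-entropy loss. That mechanism is an assumption for illustration, not something measured in this thread. The batch size, sequence length, vocabulary size, chunk size, and the `logits_bytes` helper are all hypothetical.

```python
def logits_bytes(num_tokens: int, vocab_size: int, bytes_per_element: int = 2) -> int:
    """Bytes needed to hold a (num_tokens, vocab_size) logits tensor.

    bytes_per_element=2 assumes bf16/fp16 logits (an illustrative assumption).
    """
    return num_tokens * vocab_size * bytes_per_element


# Hypothetical training shape: 8 sequences of 8192 tokens, 128256-entry vocabulary.
batch_size, seq_len, vocab_size = 8, 8192, 128256

# Materializing logits for every token at once.
full = logits_bytes(batch_size * seq_len, vocab_size)

# Processing the loss in chunks of 1024 tokens (hypothetical chunk size),
# so only one chunk of logits is alive at a time.
chunked = logits_bytes(1024, vocab_size)

GIB = 2**30
print(f"full logits:    {full / GIB:.2f} GiB")
print(f"chunked logits: {chunked / GIB:.2f} GiB")
```

Under these assumed numbers the full logits tensor alone is about 15.66 GiB, while a 1024-token chunk is about 0.24 GiB. Actual savings depend on the model, the dtype, and which kernels are used.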

gnadathur commented 1 week ago

cc: @lessw2020 who is looking at this.