linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training
https://arxiv.org/pdf/2410.10989
BSD 2-Clause "Simplified" License
3.38k stars, 190 forks

Refactored benchmark tests #196

Closed (shimizust closed 2 months ago)

shimizust commented 2 months ago

Summary

Testing Done

shimizust commented 2 months ago

> can we add a readme in benchmark/ to elaborate how to run these?

Added instructions to contributing.md

shimizust commented 2 months ago

Taking the raw results from triton.testing.do_bench here, we already get the unrounded values. For example, you can see in the CSV:

cross_entropy,liger,forward,speed,ms,V,vocab size,8192,0.8101439476013184,0.7565760016441345,0.9144319891929626,"{""B"": 8, ""T"": 2048}",NVIDIA A100-SXM4-80GB,2024-09-03 15:31:39,0.2.1
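To make the row above easier to read, here is a minimal sketch that parses it with Python's standard `csv` and `json` modules. The column names in `COLUMNS` are hypothetical labels chosen for illustration; the thread does not show the real CSV header. Reading the three floats as timing statistics in milliseconds is likewise an assumption based on the `speed` and `ms` fields.

```python
import csv
import io
import json

# The CSV row quoted above, verbatim.
ROW = (
    'cross_entropy,liger,forward,speed,ms,V,vocab size,8192,'
    '0.8101439476013184,0.7565760016441345,0.9144319891929626,'
    '"{""B"": 8, ""T"": 2048}",NVIDIA A100-SXM4-80GB,2024-09-03 15:31:39,0.2.1\n'
)

# Hypothetical column names (assumption: the real header is not shown in the thread).
COLUMNS = [
    "kernel", "provider", "mode", "metric", "unit",
    "x_name", "x_label", "x_value",
    "y_0", "y_1", "y_2",
    "extra_config", "gpu", "timestamp", "version",
]


def parse_row(line: str) -> dict:
    """Parse one benchmark CSV line into a dict with typed values."""
    values = next(csv.reader(io.StringIO(line)))
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    record = dict(zip(COLUMNS, values))
    record["x_value"] = int(record["x_value"])
    # Keep the measurements as full-precision floats; no rounding is applied.
    for key in ("y_0", "y_1", "y_2"):
        record[key] = float(record[key])
    # The extra config is a JSON object stored in a quoted CSV field.
    record["extra_config"] = json.loads(record["extra_config"])
    return record


record = parse_row(ROW)
print(record["kernel"], record["x_label"], record["x_value"])
print([record[k] for k in ("y_0", "y_1", "y_2")], record["unit"])
print(record["extra_config"])
```

The doubled quotes (`""B""`) are standard CSV escaping for a quoted field, so `csv.reader` recovers valid JSON that `json.loads` can decode.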

cc @ByronHsu @austin362667

ByronHsu commented 2 months ago

ah i see! you are writing the csv on your own