Closed: FindHao closed this pull request 3 weeks ago.
Cloned from PR https://github.com/pytorch-labs/tritonbench/pull/13 because of a merging-bot issue. Migrated from https://github.com/pytorch/benchmark/pull/2507.
Add the custom ops `fused_linear_cross_entropy`, `geglu`, and `cross_entropy` from Liger Kernel.
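For readers unfamiliar with the GEGLU op being benchmarked here, a minimal pure-Python reference sketch of the activation may help. This is only the mathematical definition (GELU-gated linear unit over a vector split into value and gate halves), not the Liger Triton kernel or the HuggingFace `LlamaMLP` module; the function names are mine.

```python
import math

def gelu(x: float) -> float:
    # Exact GELU via the Gaussian CDF: gelu(x) = x * Phi(x)
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def geglu(x: list[float]) -> list[float]:
    # GEGLU splits the projected vector in half: one half is the
    # "value" path, the other is a GELU-activated gate, multiplied
    # elementwise.
    assert len(x) % 2 == 0, "input must split evenly into value and gate halves"
    half = len(x) // 2
    value, gate = x[:half], x[half:]
    return [v * gelu(g) for v, g in zip(value, gate)]
```

In the real MLP the value and gate halves come from two learned projections of the hidden state; the fused kernel's win is computing the gate activation and product without extra intermediate tensors.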
Test Plan:
```
% python run.py --op fused_linear_cross_entropy,geglu,cross_entropy --num-inputs 1
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:12<00:00, 12.67s/it]
  x_val    LMHeadCE-latency    LigerLMHeadCE-latency    inductor_fused_linear_cross_entropy-latency
-------  ------------------  -----------------------  ---------------------------------------------
      0             139.673                  533.048                                        143.575
  0%|                                                                                                                                                          | 0/1 [00:00<?, ?it/s]
/scratch/yhao/pta/pytorch/torch/_inductor/compile_fx.py:182: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance.
  warnings.warn(
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:07<00:00, 7.65s/it]
  x_val    LlamaMLP-latency    LigerGEGLUMLP-latency    InductorLlamaMLP-latency
-------  ------------------  -----------------------  --------------------------
      0             69.2102                  69.6707                     69.0965
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.54s/it]
  x_val    CrossEntropyLoss-latency    LigerCrossEntropyLoss-latency    InductorCrossEntropyLoss-latency
-------  --------------------------  -------------------------------  ----------------------------------
      0                        0.84                          0.40736                            0.180896
```
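As background on why `fused_linear_cross_entropy` is worth benchmarking at all: the unfused LM-head path materializes a full (tokens x vocab) logits tensor before the loss reduction, which the fused kernel avoids. Below is a pure-Python reference for the per-sample cross-entropy being measured (the standard log-sum-exp formulation, not the Triton kernel), plus a back-of-the-envelope memory calculation with illustrative shapes that are my assumption, not numbers from this PR.

```python
import math

def cross_entropy(logits: list[float], target: int) -> float:
    # Numerically stable cross-entropy for one sample:
    #   CE = logsumexp(logits) - logits[target]
    # Subtracting the max keeps exp() from overflowing on large logits.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return lse - logits[target]

# Illustrative memory cost of the unfused path (hypothetical shapes:
# 8192 tokens, 128k vocabulary, float32): the logits tensor alone is
# 8192 * 128_000 * 4 bytes, roughly 4.2 GB, before the loss is computed.
logits_bytes = 8192 * 128_000 * 4
```

A fused linear + cross-entropy kernel can stream over vocabulary chunks and never hold the full logits in memory, which is the main win Liger Kernel targets here.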
@FindHao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@FindHao merged this pull request in pytorch-labs/tritonbench@72ddac91274e783ecdf5a12b30747e92d2f95b7b.