pytorch-labs / tritonbench

Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
BSD 3-Clause "New" or "Revised" License

Format benchmark function names and change x_val to corresponding input shapes #35

Closed FindHao closed 3 weeks ago

FindHao commented 3 weeks ago

Fix https://github.com/pytorch-labs/tritonbench/issues/31

Test Plan:

% python run.py --op fused_linear_cross_entropy --num-inputs 1 --metrics latency
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:13<00:00, 13.02s/it]
    (B*T, H)    torch_lm_head_ce-latency    liger_lm_head_ce-latency    inductor_fused_linear_cross_entropy-latency
------------  --------------------------  --------------------------  ---------------------------------------------
(4096, 4096)                     145.728                     526.446                                        144.567
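The idea behind the change can be sketched as follows: rather than labeling each benchmark row with an opaque `x_val` index, render the corresponding input shape as the row label. This is a minimal illustration, not the actual tritonbench code; `format_shape` is a hypothetical helper name.

```python
# Hypothetical sketch: turn an input shape tuple into a readable row label,
# as shown in the (B*T, H) column of the table above.
def format_shape(shape):
    """Render an input shape tuple as a row label, e.g. "(4096, 4096)"."""
    return "(" + ", ".join(str(dim) for dim in shape) + ")"

# Example: a fused_linear_cross_entropy input with B*T = 4096 and H = 4096.
print(format_shape((4096, 4096)))  # -> (4096, 4096)
```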
facebook-github-bot commented 3 weeks ago

@FindHao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

FindHao commented 3 weeks ago

LGTM. It is surprising that liger is so much slower than the baseline; should we report this number to the liger repo?

Oh, for this kernel it is expected, since that kernel optimizes for memory usage rather than latency.

facebook-github-bot commented 3 weeks ago

@FindHao merged this pull request in pytorch-labs/tritonbench@dcefed3a7bfacb7564334a063b5b81444b0815db.