Add jagged_sum operator for unpadded nested tensors to TritonBench

jananisriram commented 2 weeks ago

Summary: Add a jagged_sum reduction operator for unpadded nested tensors to TritonBench, based on the PyTorch sum operator. This diff implements a basic benchmark that reduces along the ragged dimension of 3-dimensional nested tensors. For a nested tensor of shape (B, *, M), where * is the ragged dimension, the benchmark applies PyTorch's sum operator to each of the B 2-dimensional (*, M) component tensors and collects the results into a (B, M) output tensor.
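To make the reduction semantics concrete, here is a minimal reference sketch in plain Python. It is not the code from this diff: the benchmark itself uses PyTorch's sum operator on nested tensors, while this sketch uses nested lists so it runs without PyTorch. The function name `jagged_sum_reference` and the sample data are hypothetical.

```python
from typing import List

Row = List[float]


def jagged_sum_reference(nested: List[List[Row]], M: int) -> List[Row]:
    """Sum each (*, M) component over its ragged dimension.

    `nested` holds B components; component i has an arbitrary number of
    rows, each of length M. The result has shape (B, M).
    Hypothetical helper for illustration only.
    """
    out: List[Row] = []
    for component in nested:
        acc = [0.0] * M  # running column sums for this component
        for row in component:
            if len(row) != M:
                raise ValueError(f"expected rows of length {M}, got {len(row)}")
            for j, x in enumerate(row):
                acc[j] += x
        out.append(acc)
    return out


# B = 3 components with 3, 1, and 0 rows along the ragged dimension; M = 2.
nested = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[10.0, 20.0]],
    [],
]
result = jagged_sum_reference(nested, M=2)
print(result)  # [[9.0, 12.0], [10.0, 20.0], [0.0, 0.0]]
```

Each output row depends only on its own component, so no padding is needed and an empty component reduces to a row of zeros.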

Measure the performance of the basic benchmark with the gbps and latency metrics, and display the nested tensor parameters B and M.
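As a rough illustration of how a gbps figure can relate to latency, the sketch below assumes that gbps counts the bytes read from the input plus the bytes written to the output, divided by the measured latency. That definition is an assumption and may differ from the one used in this diff; the function name `estimate_gbps` and the sample numbers are hypothetical.

```python
def estimate_gbps(
    num_input_elements: int,
    num_output_elements: int,
    bytes_per_element: int,
    latency_ms: float,
) -> float:
    """Estimate memory throughput in GB/s.

    Assumption: traffic = input bytes read + output bytes written.
    Hypothetical helper, not the TritonBench metric implementation.
    """
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    total_bytes = (num_input_elements + num_output_elements) * bytes_per_element
    seconds = latency_ms * 1e-3
    return total_bytes / seconds / 1e9


# Example: B = 4, M = 8, 100 rows in total across the ragged dimension,
# float32 elements (4 bytes), and a made-up latency of 0.01 ms.
B, M, total_rows = 4, 8, 100
throughput = estimate_gbps(
    num_input_elements=total_rows * M,  # 800 elements read
    num_output_elements=B * M,          # 32 elements written
    bytes_per_element=4,
    latency_ms=0.01,
)
print(f"{throughput:.4f} GB/s")
```

Under this assumption the input size depends on the total number of rows along the ragged dimension, not only on B and M, so two inputs with the same B and M can report different gbps values.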

Reviewed By: YuqingJ

Differential Revision: D58396957

facebook-github-bot commented 2 weeks ago

This pull request was exported from Phabricator. Differential Revision: D58396957

facebook-github-bot commented 2 weeks ago

This pull request has been merged in pytorch/benchmark@576b2b29f97db06fa345285e78dfc144d87735c2.

Add jagged_sum operator for unpadded nested tensors to TritonBench #2299