Summary:
Add a jagged_sum reduction operator for unpadded nested tensors, based on the PyTorch sum operator, to TritonBench. This diff implements a basic benchmark for reducing along the ragged dimension of 3-dimensional nested tensors. For a nested tensor of shape (B, *, M), where * is the ragged dimension, the benchmark uses PyTorch's sum operator to reduce each of the B 2-dimensional tensors of shape (*, M) along its ragged dimension, producing a (B, M) output tensor.
Measure the performance of the basic benchmark with the gbps and latency metrics, and display the nested tensor parameters B and M.
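As a minimal sketch of the reduction described above (not the benchmark code from this diff), the following sums each (*, M) component of a nested tensor and stacks the results into a (B, M) tensor. The names jagged_sum_reference and approx_gbps are hypothetical, and the gbps formula (input bytes plus output bytes, divided by latency) is an assumption about how such a metric might be computed.

```python
import torch


def jagged_sum_reference(nt: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: unbind() yields the B component tensors,
    # each of shape (*, M); summing over dim 0 collapses the ragged dimension.
    return torch.stack([t.sum(dim=0) for t in nt.unbind()])


def approx_gbps(components, out: torch.Tensor, latency_ms: float) -> float:
    # Assumed metric: bytes read from the input plus bytes written to the
    # output, divided by the measured latency, in GB/s.
    bytes_in = sum(t.numel() * t.element_size() for t in components)
    bytes_out = out.numel() * out.element_size()
    return (bytes_in + bytes_out) / (latency_ms * 1e-3) / 1e9


B, M = 3, 4
lengths = [2, 5, 1]  # ragged lengths, one per batch entry
components = [torch.ones(n, M) for n in lengths]
nt = torch.nested.nested_tensor(components)  # shape (B, *, M)

out = jagged_sum_reference(nt)  # shape (B, M)
```

Because every component is filled with ones, each output row equals the ragged length of the corresponding batch entry, which makes the result easy to check by hand.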
Reviewed By: YuqingJ
Differential Revision: D58396957