When rebatching stbe lengths without output, the lengths must be a 2D tensor of shape [F x B], so we can concatenate them directly along dim 1.
For MTIA inference, if the stbe is remote, its output is padded to the max batch size, which breaks the split. In this case, we remove the padding and restore the output to its original size.
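
A minimal sketch of the two cases above, not the code in this diff. It uses plain nested lists in place of tensors so it runs without dependencies. The helper names (concat_lengths_dim1, strip_batch_padding, split_rows) are hypothetical, and it assumes the padded output has the batch on dim 0 with padding appended as trailing rows.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def concat_lengths_dim1(chunks: Sequence[Matrix]) -> Matrix:
    """Concatenate [F x B_i] length matrices along dim 1 (the batch dim)."""
    num_features = len(chunks[0])
    if any(len(chunk) != num_features for chunk in chunks):
        raise ValueError("all chunks must have the same number of features F")
    # For each feature row, append the batch columns of every chunk in order.
    return [
        [value for chunk in chunks for value in chunk[f]]
        for f in range(num_features)
    ]


def strip_batch_padding(padded: Matrix, original_batch_size: int) -> Matrix:
    """Drop trailing padded rows (assumption: batch is dim 0, padding is last)."""
    if original_batch_size > len(padded):
        raise ValueError("original batch size exceeds padded batch size")
    return padded[:original_batch_size]


def split_rows(rows: Matrix, sizes: Sequence[int]) -> List[Matrix]:
    """Split rows into consecutive pieces; sizes must sum to len(rows)."""
    if sum(sizes) != len(rows):
        raise ValueError("split sizes do not match the batch size")
    pieces, start = [], 0
    for size in sizes:
        pieces.append(rows[start:start + size])
        start += size
    return pieces


# Lengths: two [F=2 x B] chunks with B=2 and B=1 become one [2 x 3] matrix.
merged = concat_lengths_dim1([[[1, 2], [3, 4]], [[5], [6]]])

# Output: 3 real rows padded to a max batch size of 4.
padded_output = [[0.1], [0.2], [0.3], [0.0]]
restored = strip_batch_padding(padded_output, original_batch_size=3)
pieces = split_rows(restored, sizes=[2, 1])
```

Calling split_rows on padded_output directly raises ValueError, because the per-request sizes sum to 3 while the padded batch has 4 rows. This mirrors why the padding has to be removed before the split.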
Summary:
Differential Revision: D64914077