Fix remove_padding for higher dimensionality tensors

The current remove_padding functionality does not work for higher-dimensional tensors (ndim > 2). In fact, the 3D test was passing only by coincidence, because the first and second dimensions of the tested tensor were the same size.

The padding_mask only needs to operate over the batch dimension, which by convention is dim=0. That means it must always be one-dimensional when applied to the tensor. The current code reduces only one dimension, so for tensors with more than two dimensions the mask still has more than one dimension, leading to unexpected and often erroneous results.
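
To illustrate the point about mask dimensionality, here is a minimal sketch. It is not the library's implementation: `remove_padding_sketch` is a hypothetical helper, and the pad value of -1 and the tensor shapes are assumptions chosen for the example. It contrasts a mask reduced over a single dimension with one collapsed down to the batch dimension.

```python
import torch


def remove_padding_sketch(tensor: torch.Tensor, pad_value: float) -> torch.Tensor:
    """Hypothetical helper: drop entries along dim=0 that consist only of pad_value."""
    padding_mask = tensor == pad_value
    if tensor.ndim > 1:
        # Collapse every non-batch dimension so the mask has shape (batch,).
        padding_mask = padding_mask.flatten(start_dim=1).all(dim=1)
    return tensor[~padding_mask]


# Two real samples of shape (2, 4), followed by one sample that is pure padding.
batch = torch.arange(16, dtype=torch.float32).reshape(2, 2, 4)
padded = torch.cat([batch, torch.full((1, 2, 4), -1.0)])

# Reducing only the last dimension leaves a 2-D mask of shape (3, 2).
partial_mask = torch.all(padded == -1.0, dim=-1)

# A 2-D boolean mask indexes the first two dimensions together,
# so the per-sample structure is flattened away.
wrong = padded[~partial_mask]

# A 1-D mask over dim=0 removes whole samples and keeps their shape.
right = remove_padding_sketch(padded, -1.0)

print(wrong.shape, right.shape)
```

With a 1-D mask, indexing selects whole samples along dim=0, so the trailing dimensions of each remaining sample are preserved regardless of how many there are.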

This PR fixes these issues and adds tests that validate the corrected behavior for higher-dimensional tensors. The previous tests are kept to show that there is no regression.

Chris-hughes10 / pytorch-accelerated

Fix remove_padding for higher dimensionality tensors #52