Adding perceptual weighting to `SumAndDifferenceSTFTLoss`

This enables simple A-weighting as a pre-filter that is applied to the input signals before the sum and difference signals are computed.
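
For readers unfamiliar with A-weighting, here is a rough sketch of the gain curve it approximates. It uses the standard analog A-weighting magnitude formula with rounded constants; it is not the filter implementation from this PR, and `a_weighting_db` is a hypothetical helper name used only for illustration.

```python
import math


def a_weighting_db(freq_hz: float) -> float:
    """Approximate A-weighting gain in dB at a given frequency (hypothetical helper)."""
    f2 = freq_hz ** 2
    # Standard analog A-weighting magnitude response, constants rounded.
    num = (12194.0 ** 2) * f2 ** 2
    den = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to roughly 0 dB at 1 kHz.
    return 20.0 * math.log10(num / den) + 2.00


if __name__ == "__main__":
    for f in (100.0, 1000.0, 10000.0):
        print(f"{f:>8.0f} Hz: {a_weighting_db(f):6.2f} dB")
```

The curve is close to 0 dB at 1 kHz and attenuates low frequencies strongly, so errors in the bands where hearing is most sensitive contribute more to the loss once the pre-filter is applied.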

Example usage:

```python
import torch
import auraloss

# Batched stereo signals: (batch size, 2, seq_len)
target = torch.rand(8, 2, 44100)
pred = torch.rand(8, 2, 44100)

loss_fn = auraloss.freq.SumAndDifferenceSTFTLoss(
    fft_sizes=[1024, 2048, 8192],
    hop_sizes=[256, 512, 2048],
    win_lengths=[1024, 2048, 8192],
    perceptual_weighting=True,
    sample_rate=44100,  # required when perceptual_weighting=True
    scale="mel",
    n_bins=128,
)
res = loss_fn(pred, target)
```

Notes:

- The `sample_rate` parameter must be supplied when `perceptual_weighting=True`.
- This module requires `pred` and `target` to be batched stereo tensors of shape `(batch size, 2, seq_len)`.
- This is a breaking change: the default values for `fft_sizes`, `hop_sizes`, and `win_lengths` are removed, so callers must now pass them explicitly. The goal is to avoid errors from silently relying on defaults that may not be optimal for every audio sample rate.
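
Since the sizes now have to be chosen explicitly, one possible way to pick them is to scale a reference configuration with the sample rate. The sketch below is only an illustration: `scaled_stft_sizes` is a hypothetical helper that is not part of auraloss, and the reference sizes (taken from the example above), the power-of-two rounding, and the hop of one quarter of the FFT size are all assumptions.

```python
import math


def scaled_stft_sizes(
    sample_rate,
    ref_rate=44100,
    ref_fft_sizes=(1024, 2048, 8192),
    hop_div=4,
):
    """Scale reference FFT sizes to another sample rate (hypothetical helper)."""
    fft_sizes = []
    for n in ref_fft_sizes:
        # Keep the window duration roughly constant, then round to the
        # nearest power of two (nearest in the log2 sense).
        scaled = n * sample_rate / ref_rate
        fft_sizes.append(2 ** max(1, round(math.log2(scaled))))
    # Assumption: hop is FFT size / hop_div, window length equals FFT size.
    hop_sizes = [n // hop_div for n in fft_sizes]
    win_lengths = list(fft_sizes)
    return fft_sizes, hop_sizes, win_lengths


if __name__ == "__main__":
    for sr in (22050, 44100, 96000):
        print(sr, scaled_stft_sizes(sr))
```

The three returned lists can be passed as `fft_sizes`, `hop_sizes`, and `win_lengths` when constructing the loss. Whether this scaling is appropriate depends on the material being modeled, so treat it as a starting point rather than a recommendation.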

csteinmetz1 / auraloss

Adding perceptual weighting to `SumAndDifferenceSTFTLoss` #48