sigsep / open-unmix-pytorch

Open-Unmix - Music Source Separation for PyTorch
https://sigsep.github.io/open-unmix/
MIT License

Typo in README - input tensor shape of OpenUnmix #84

Closed by sevagh 3 years ago

sevagh commented 3 years ago

Hello, I believe the true input shape of OpenUnmix (the spectrogram model, not the on-the-fly waveform one) is this, taken from the code:

(nb_samples, nb_channels, nb_bins, nb_frames)

This corresponds to the (I, F, T) that I've seen in the oracle code (I = channels, F = frequency bins, T = time frames).

The README describes the shape in a different order:

models.OpenUnmix: The core open-unmix takes magnitude spectrograms directly (e.g. when pre-computed and loaded from disk). In that case, the input is of shape (nb_frames, nb_samples, nb_channels, nb_bins)

sevagh commented 3 years ago

Huh, in fact the code right after says:

# get current spectrogram shape
nb_frames, nb_samples, nb_channels, nb_bins = x.data.shape

So I think the documentation block of the code might need adjusting. README might be correct?

faroit commented 3 years ago

Yes, (nb_samples, nb_channels, nb_bins, nb_frames) is correct. Feel free to open a PR to correct the README; we changed this in the recent model.
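
For anyone trying to reconcile the two orderings quoted in this thread, here is a minimal sketch in plain Python. It assumes (the thread does not show this) that the model moves the frame axis to the front before the quoted unpacking line, which in PyTorch would be something like `x.permute(3, 0, 1, 2)`. The helper `permute_shape` and the example sizes are hypothetical and only illustrate how the documented input order and the unpacked order can both hold.

```python
# Illustrative only: the sizes below are made up, not taken from the thread.
DOCUMENTED_ORDER = ("nb_samples", "nb_channels", "nb_bins", "nb_frames")
example_sizes = {"nb_samples": 4, "nb_channels": 2, "nb_bins": 2049, "nb_frames": 100}


def permute_shape(shape, order):
    """Reorder a shape tuple the way torch.Tensor.permute reorders dimensions."""
    if sorted(order) != list(range(len(shape))):
        raise ValueError("order must be a permutation of the dimension indices")
    return tuple(shape[i] for i in order)


# Shape of the spectrogram as passed to the model (order confirmed in the thread).
input_shape = tuple(example_sizes[name] for name in DOCUMENTED_ORDER)

# ASSUMPTION: the frame axis (index 3) is moved to the front before unpacking.
internal_shape = permute_shape(input_shape, (3, 0, 1, 2))

# Same unpacking order as the line quoted from the code.
nb_frames, nb_samples, nb_channels, nb_bins = internal_shape

print(input_shape)     # (4, 2, 2049, 100)
print(internal_shape)  # (100, 4, 2, 2049)
```

Under that assumption, the input shape stated in the docstring and the unpacking line describe the same tensor before and after the axis reorder, so only the README would need the fix.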