bytedance / music_source_separation

Why can wrong code work? #32

Open poor1017 opened 2 years ago

poor1017 commented 2 years ago

Hi @qiuqiangkong

I noticed that you modified the code in resunet_subbandtime.py a couple of days ago.

Now the code is:

separated_subband_audio = torch.stack(
    [
        self.feature_maps_to_wav(
            input_tensor=x[:, j :: self.subbands_num, :, :],
            sp=mag[:, j :: self.subbands_num, :, :],
            sin_in=sin_in[:, j :: self.subbands_num, :, :],
            cos_in=cos_in[:, j :: self.subbands_num, :, :],
            audio_length=audio_length,
        )
        for j in range(self.subbands_num)
    ],
    dim=2,
)

I think this change was made for compatibility with pqmf.synthesis.

The old code is:

separated_subband_audio = torch.cat(
    [
        self.feature_maps_to_wav(
            input_tensor=net_output[:, j * C1 : (j + 1) * C1, :, :],
            sp=mag[:, j * C2 : (j + 1) * C2, :, :],
            sin_in=sin_in[:, j * C2 : (j + 1) * C2, :, :],
            cos_in=cos_in[:, j * C2 : (j + 1) * C2, :, :],
            audio_length=audio_length,
        )
        for j in range(self.subbands_num)
    ],
    dim=1,
)

The old code is not compatible with pqmf.synthesis: pqmf.synthesis assumes a subband-first layout, while the tensor produced by the old code is channel-first.
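To make the difference between the two snippets concrete, here is a minimal sketch in plain Python (no torch needed) that lists which indices each slicing style selects on the channel axis, and where each (subband, channel) pair ends up in the output. The sizes `subbands_num = 4` and `group_size = 2` are made-up values for illustration, the helper names are hypothetical, and the sketch assumes `feature_maps_to_wav` returns a tensor of shape (batch, channels, time); none of this is taken from the repository.

```python
def block_indices(j, group_size):
    """Indices selected by the old-style slice [j * C : (j + 1) * C]."""
    return list(range(j * group_size, (j + 1) * group_size))


def strided_indices(j, subbands_num, total):
    """Indices selected by the new-style slice [j :: subbands_num]."""
    return list(range(total))[j::subbands_num]


def cat_position(j, c, group_size):
    """Flat position of (subband j, channel c) after concatenating along dim=1."""
    return j * group_size + c


def stack_position(j, c, subbands_num):
    """Flat position of (subband j, channel c) after stacking along dim=2,
    if the (channel, subband) axes were flattened in row-major order."""
    return c * subbands_num + j


subbands_num = 4  # assumed value, for illustration only
group_size = 2    # assumed number of channels per subband
total = subbands_num * group_size

for j in range(subbands_num):
    print(j, block_indices(j, group_size), strided_indices(j, subbands_num, total))
```

With these assumed sizes the two slicing styles select different indices for every `j`, and the two output layouts agree only at the first and last positions. They coincide everywhere only when there is a single channel per subband (`group_size = 1`) or a single subband.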

It is strange that the old code still works and can produce a pretty good model. Do you know the reason?

Thanks