Why the input tensor shape is (B,F,T,2)?

Xiaobin-Rong / deepvqe

An unofficial implementation of DeepVQE proposed by Microsoft Corp.

Why the input tensor shape is (B,F,T,2)? #1

Open steven8274 opened 10 months ago

steven8274 commented 10 months ago

Hi, Xiaobin, thanks for the implementation of 'DeepVQE'. I've read the paper and your code, but I have some questions: Why is the input tensor shape (B,F,T,2)? What do 'B', 'F', and 'T' mean? Thanks in advance.

Xiaobin-Rong commented 10 months ago

Sorry for the confusion caused by my lack of clarity. The input tensor is a batch of noisy spectrograms, where B means the batch size, and F and T refer to the number of frequency bins and time frames, respectively. The final dimension of size 2 holds the real and imaginary parts of the complex spectrogram.
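
To make the layout concrete, here is a minimal sketch of how a tensor of shape (B,F,T,2) could be produced from raw waveforms with PyTorch. The helper name `waveform_to_spec`, the STFT settings (`n_fft=512`, `hop_length=256`, Hann window), and the 1-second 16 kHz random input are assumptions chosen for illustration, not values confirmed by the repository.

```python
import torch


def waveform_to_spec(wave: torch.Tensor, n_fft: int = 512, hop_length: int = 256) -> torch.Tensor:
    """Hypothetical helper: convert waveforms (B, num_samples) to (B, F, T, 2).

    The STFT parameters are illustrative assumptions, not the repository's settings.
    """
    window = torch.hann_window(n_fft)
    # Complex spectrogram of shape (B, F, T), where F = n_fft // 2 + 1.
    spec = torch.stft(
        wave,
        n_fft=n_fft,
        hop_length=hop_length,
        window=window,
        return_complex=True,
    )
    # Split each complex value into a trailing axis of size 2: [real, imag].
    return torch.view_as_real(spec)


B, num_samples = 4, 16000  # assumed: 4 clips of 1 second at 16 kHz
wave = torch.randn(B, num_samples)
x = waveform_to_spec(wave)

real = x[..., 0]  # (B, F, T) real parts
imag = x[..., 1]  # (B, F, T) imaginary parts
print(x.shape)    # (B, F, T, 2)
```

With these assumed settings, F is `n_fft // 2 + 1 = 257`, and T depends on the clip length and hop size. If a complex tensor is needed again, `torch.view_as_complex(x)` reverses the last step.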

steven8274 commented 10 months ago

Thanks!