Closed zhaoyanpeng closed 2 years ago
Hi there,
FLAC audio can be multi-channel; however, we only use single-channel information (our AudioSet data is single-channel). If you have multi-channel audio, you can just use the first channel, which can be done simply with:
import torchaudio
waveform, sr = torchaudio.load(filename)  # waveform shape: (channels, samples)
waveform = waveform[0, :]  # keep only the first channel
-Yuan
Thanks for the reply. That is what I am doing. I am wondering what you did to get single-channel FLAC audio.
... From the code, it looks like your FLAC audio files are always single-channel. I am just wondering how that came about. Thanks.
Yes, all the data I have is single-channel. I did not download the data myself, so I am not sure exactly how it was done. But I am quite sure that the single-channel audio was obtained with a naive method like the sample code I showed above (i.e., no beamforming was used), so I don't think it matters much.
To use our pretrained model, I think 16 kHz single-channel audio in either .wav or .flac format should work. You will need to take care of normalization if the scale of your data is different.
Hi Yuan, would it be better to elaborate on how to ensure that the FLAC audio files are single-channel?
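One way to make sure the audio is single-channel is to check the channel dimension after loading and keep only the first channel, as in the snippet earlier in this thread. The sketch below is only an illustration under stated assumptions, not the procedure that was used to build the dataset discussed here: the helper names `to_single_channel` and `load_single_channel` are hypothetical, and the resampling step is an added assumption based on the 16 kHz rate mentioned above. Normalization is left out because the expected scale is not stated in this thread.

```python
TARGET_SR = 16000  # sample rate mentioned in this thread for the pretrained model


def to_single_channel(waveform):
    """Keep only the first channel of a (channels, samples) tensor or array.

    A 1-D input is assumed to be single-channel already and is returned as-is.
    """
    if waveform.ndim == 1:
        return waveform
    if waveform.ndim != 2:
        raise ValueError(f"expected 1-D or 2-D audio, got {waveform.ndim} dimensions")
    return waveform[0]  # naive selection: no downmixing, no beamforming


def load_single_channel(path, target_sr=TARGET_SR):
    """Load an audio file with torchaudio and return (1-D waveform, sample rate)."""
    import torchaudio  # imported here so to_single_channel can be used without it

    waveform, sr = torchaudio.load(path)  # waveform shape: (channels, samples)
    if sr != target_sr:
        # Assumption: resample so the result matches the 16 kHz rate noted above.
        waveform = torchaudio.functional.resample(waveform, sr, target_sr)
        sr = target_sr
    return to_single_channel(waveform), sr
```

Calling `load_single_channel("example.flac")` (a placeholder path) would return a 1-D waveform regardless of how many channels the file has. Taking the first channel mirrors the naive method described above; averaging the channels would be an alternative, but it is not what this thread describes.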