Closed by csukuangfj 10 months ago
See also https://github.com/openai/whisper/pull/1761
The difference is that Whisper v3 uses 128-dimensional features, whereas the earlier Whisper models use 80-dimensional features.
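```python
import torch
import kaldifeat

opts = kaldifeat.WhisperFbankOptions()
opts.num_mels = 128  # Whisper v3 uses 128 mel bins; earlier models use 80
opts.device = torch.device('cuda', 0)

fbank = kaldifeat.WhisperFbank(opts)

# `wave` is not defined in this snippet; it is assumed to be a 1-D torch
# tensor of audio samples loaded beforehand.
features = fbank(wave)
```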
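Since the only change is the feature dimension, one way to avoid hard-coding it is to pick `num_mels` from the model name. The following is a minimal sketch under assumptions: `whisper_num_mels` is a hypothetical helper, not part of kaldifeat or whisper, and it assumes that v3 checkpoints can be recognized by the substring `v3` in their name.

```python
def whisper_num_mels(model_name: str) -> int:
    """Return the mel-bin count to use for a Whisper model name.

    Hypothetical helper (not part of kaldifeat or whisper). It assumes
    that v3 checkpoints carry "v3" in their name; every other model is
    treated as using 80-dimensional features.
    """
    return 128 if "v3" in model_name.lower() else 80


print(whisper_num_mels("large-v3"))   # v3 model
print(whisper_num_mels("medium.en"))  # earlier model
```

The result could then be assigned to `opts.num_mels` before constructing `kaldifeat.WhisperFbank`.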
Usage