Open peggyxpxu opened 4 months ago
Why does transcribe_audio call pad_or_trim with 1000 rather than 3000?
mel = pad_or_trim(mel, 1000).to(model.device).to(dtype)
Oh, that is because most of our data is 10 s long, so padding to 1000 instead of 3000 is just to save some compute.
-Yuan
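
For readers wondering where 1000 and 3000 come from, here is a rough sketch of the frame arithmetic. It assumes a Whisper-style front end (16 kHz audio, 10 ms hop, so 100 mel frames per second). `frames_for` and `pad_or_trim_sketch` are hypothetical NumPy stand-ins written for illustration, not the function used in the repo.

```python
import numpy as np

SAMPLE_RATE = 16000  # assumption: Whisper-style 16 kHz input
HOP_LENGTH = 160     # assumption: 10 ms hop between mel frames
FRAMES_PER_SECOND = SAMPLE_RATE // HOP_LENGTH  # 100 frames per second


def frames_for(seconds):
    """Number of mel frames covering `seconds` of audio (hypothetical helper)."""
    return int(seconds * FRAMES_PER_SECOND)


def pad_or_trim_sketch(mel, length, axis=-1):
    """Zero-pad or cut `mel` along `axis` so it has exactly `length` frames.

    Illustrative re-implementation only, not the library function.
    """
    n = mel.shape[axis]
    if n > length:
        # Too long: keep only the first `length` frames.
        index = [slice(None)] * mel.ndim
        index[axis] = slice(0, length)
        return mel[tuple(index)]
    if n < length:
        # Too short: append zeros at the end of the time axis.
        pad = [(0, 0)] * mel.ndim
        pad[axis] = (0, length - n)
        return np.pad(mel, pad)
    return mel


# A made-up 7.5 s clip with 80 mel bins.
mel = np.random.rand(80, frames_for(7.5))
short = pad_or_trim_sketch(mel, frames_for(10))  # 10 s window
full = pad_or_trim_sketch(mel, frames_for(30))   # 30 s window
print(short.shape, full.shape)
```

Under these assumptions, a 10 s window is 1000 frames and a 30 s window is 3000 frames, so padding mostly-10 s clips to 3000 would feed the model three times as many frames, most of them zeros.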