**twardoch** opened this issue 8 months ago (status: Open)
Try uncommenting the `n_mels` line:

```python
whisper_s2t_model = whisper_s2t.load_model(
    model_identifier=Config.model_identifier,
    backend=Config.backend,
    asr_options={"word_timestamps": True},
    # n_mels=128  # This doesn't matter
)
```
By "this doesn't work" I meant: it fails whether the parameter is commented out or not.
@twardoch this is a bug in the aligner model. By default, the tiny model is used for alignment, and it expects `n_mels` to be of size 80, while large-v3 expects `n_mels` to be of size 128. Since the same preprocessor is shared between the two models, you are hitting this issue.
I will fix this in the next release.
Meanwhile, to use large-v3, disable word timestamps (which should fix your issue):

```python
asr_options={"word_timestamps": False},
```
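Put together with the load call from earlier in the thread, the workaround looks like this (the `Config` fields are whatever you already use; only the flag changes):

```python
# Same call as before, but with word-level timestamps disabled so the
# tiny alignment model (and its 80-bin preprocessor) is never invoked.
whisper_s2t_model = whisper_s2t.load_model(
    model_identifier=Config.model_identifier,  # e.g. "large-v3"
    backend=Config.backend,
    asr_options={"word_timestamps": False},
)
```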
Thanks! I do want them wordstamps though ;)
@twardoch You can add a separate preprocessor with a fixed number of `n_mels`, as shown in this commit.
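The shape clash being described can be illustrated with a minimal, numpy-only sketch. The 80 vs. 128 bin counts match the thread (tiny aligner expects 80 mel bins, large-v3 produces 128); the function names here are hypothetical stand-ins, not WhisperS2T API:

```python
import numpy as np

def fake_log_mel(n_frames: int, n_mels: int) -> np.ndarray:
    # Stand-in for a log-mel preprocessor; shape convention is
    # (n_mels, n_frames), as in Whisper-style models.
    return np.zeros((n_mels, n_frames), dtype=np.float32)

def aligner_forward(mel: np.ndarray, expected_n_mels: int = 80) -> str:
    # The tiny alignment model's input layer is sized for 80 mel bins,
    # so any other bin count fails before inference even starts.
    if mel.shape[0] != expected_n_mels:
        raise ValueError(
            f"aligner expects {expected_n_mels} mel bins, got {mel.shape[0]}"
        )
    return "ok"

# large-v3's shared preprocessor emits 128 bins -> the aligner rejects it.
try:
    aligner_forward(fake_log_mel(3000, n_mels=128))
except ValueError as e:
    print(e)  # aligner expects 80 mel bins, got 128

# A separate preprocessor pinned to 80 bins just for the aligner avoids it.
print(aligner_forward(fake_log_mel(3000, n_mels=80)))  # ok
```

This is why a second preprocessor with a fixed `n_mels` fixes the issue: the ASR model and the aligner each get features of the bin count they were trained with.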
When trying to run this code with the `large-v3` model identifier, I keep getting an error. With `large-v2`, it works fine.