jwijffels opened 5 months ago
I can look into testing this today.
I ran it twice on rant1.mp3 with the medium model.
```r
# No CoreML
remotes::install_github("bnosac/audio.whisper", force = TRUE)
model <- audio.whisper::whisper("medium")
trans <- predict(model, newdata = "output.wav", language = "en", n_threads = 1)
trans$timing
#> 35.22 min
```
```r
# CoreML
Sys.setenv(WHISPER_COREML = "1")
remotes::install_github("bnosac/audio.whisper", force = TRUE)
model <- audio.whisper::whisper("ggml-medium.bin")
trans1 <- predict(model, newdata = "output.wav", language = "en", n_threads = 1)
trans1$timing
#> 11.53 min
trans2 <- predict(model, newdata = "output.wav", language = "en", n_threads = 1)
trans2$timing
#> 10.59 min
```
That speedup on 1 thread (35.22 / 11.53 ≈ 3.05x) corresponds to the 3x speedup mentioned at https://github.com/ggerganov/whisper.cpp?tab=readme-ov-file#core-ml-support. The 1 min gain between the two CoreML runs is smaller than I thought it would be after reading that comment.
I would be interested to understand whether this now also gives an extra speedup on top of Metal.
Looks like the `PKG_CPPFLAGS += -fobjc-arc` at https://github.com/bnosac/audio.whisper/blob/master/src/Makevars#L117, needed for compiling CoreML, is incompatible with the compilation of metal.m.
That means I'll have to split the targets in the Makevars to allow Metal together with CoreML.
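A split along those lines might look roughly like this (a sketch only; the object and source file names are illustrative, not the actual Makevars targets):

```make
# Compile only the Core ML objects with ARC; metal.m keeps the default flags
# instead of picking up -fobjc-arc from the global PKG_CPPFLAGS.
COREML_CPPFLAGS = $(PKG_CPPFLAGS) -fobjc-arc

whisper-encoder.o: whisper-encoder.mm
	$(CXX) $(CXXFLAGS) $(COREML_CPPFLAGS) -c $< -o $@

ggml-metal.o: ggml-metal.m
	$(CC) $(CFLAGS) $(PKG_CPPFLAGS) -c $< -o $@
```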
Should be enabled with this commit: https://github.com/bnosac/audio.whisper/commit/f81a8179870c786afadc9ce3b221733d0a6ba8e4
Install with `WHISPER_COREML = "1"` set, as in the CoreML snippet above.
As shown at https://github.com/ggerganov/whisper.cpp?tab=readme-ov-file#core-ml-support, get one of the CoreML models from https://huggingface.co/ggerganov/whisper.cpp/tree/d15393806e24a74f60827e23e986f0c10750b358, unzip it, and put it at the same path as the non-CoreML model, so that you have both in your working directory.
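For base.en, for example, the working directory would end up looking like this (a sketch: the curl URL follows the naming convention of the Hugging Face repo linked above, and the `touch`/`mkdir` lines only simulate the resulting layout):

```shell
# Download and unzip the Core ML encoder next to the regular ggml model
# (network steps commented out; run them for real use):
#   curl -LO https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en-encoder.mlmodelc.zip
#   unzip ggml-base.en-encoder.mlmodelc.zip

# Simulate the expected layout:
mkdir -p models
touch models/ggml-base.en.bin                  # regular ggml model (stand-in)
mkdir -p models/ggml-base.en-encoder.mlmodelc  # unzipped Core ML encoder (stand-in)
ls -1 models
```

whisper.cpp looks up the `-encoder.mlmodelc` directory by name next to the `.bin` file, which is why both must sit side by side.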
Then provide the path to the non-CoreML model:

```r
model <- whisper("ggml-base.en.bin")
```
From the README at whisper.cpp: