songlairui opened 7 months ago
> Thank you for the review. Yep, lightning-whisper-mlx is batching multiple chunks of audio at once. The more memory you have, the faster it will go!
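To make the quoted point concrete, here is a minimal conceptual sketch of what "batching multiple chunks of audio at once" can mean and why it trades RAM for speed. This is not lightning-whisper-mlx's actual code; the 16 kHz sample rate and 30 s window are standard Whisper parameters, and everything else is illustrative.

```python
import numpy as np

SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz audio
CHUNK_SECONDS = 30     # Whisper operates on 30-second windows

def batched_chunks(audio: np.ndarray, batch_size: int = 4):
    """Yield arrays of shape (<= batch_size, CHUNK_SECONDS * SAMPLE_RATE)."""
    chunk_len = CHUNK_SECONDS * SAMPLE_RATE
    n_chunks = -(-audio.size // chunk_len)            # ceiling division
    padded = np.zeros(n_chunks * chunk_len, dtype=audio.dtype)
    padded[:audio.size] = audio                       # pad to a whole chunk
    chunks = padded.reshape(n_chunks, chunk_len)
    # Each batch can be run through the model in a single forward pass.
    # A larger batch_size keeps more chunks (and their activations) in
    # memory at once -- which is why more RAM means faster transcription,
    # and why an oversized batch can freeze a 16 GB machine.
    for i in range(0, n_chunks, batch_size):
        yield chunks[i:i + batch_size]
```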
Is there any relationship with Core ML? I once tested whisper.cpp with Core ML, and it consumed more RAM than plain whisper.cpp.
Device:
MacBook Pro 14" (M1 Pro, 16 GB RAM)
Input:
408 s WAV file, language: zh (unable to use a distil model for Chinese)
Test:
whisper.cpp
models/ggml-large.bin (the default "large" is large-v3), -l zh
Result: 100.37 s

lightning-whisper-mlx
model="large-v3", batch_size=4, quant="4bit" (a larger batch_size freezes my device)
Result: 58.83 s
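For anyone who wants to reproduce these timings, a rough script along these lines should work. It assumes whisper.cpp's main binary is built in the working directory with the model downloaded, and that lightning-whisper-mlx is pip-installed; the constructor and transcribe call follow that project's README as I understand it, and input_408s.wav is a hypothetical stand-in for the test file.

```python
import subprocess
import time

from lightning_whisper_mlx import LightningWhisperMLX

WAV = "input_408s.wav"  # hypothetical name for the 408 s test file

# whisper.cpp: ggml large model, language forced to Chinese (-l zh)
t0 = time.perf_counter()
subprocess.run(
    ["./main", "-m", "models/ggml-large.bin", "-l", "zh", "-f", WAV],
    check=True,
)
print(f"whisper.cpp: {time.perf_counter() - t0:.2f} s")

# lightning-whisper-mlx: large-v3 with 4-bit quantization, batch_size=4
lw = LightningWhisperMLX(model="large-v3", batch_size=4, quant="4bit")
t0 = time.perf_counter()
text = lw.transcribe(audio_path=WAV)["text"]
print(f"lightning-whisper-mlx: {time.perf_counter() - t0:.2f} s")
```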
Here is the RAM usage during the runs:
Conclusion
Based on these results, lightning-whisper-mlx with the "large-v3" model has a clear speed advantage, completing the task in roughly half the time of whisper.cpp. However, that speed comes at the cost of higher RAM usage, which can be a limiting factor on machines with less available memory. For users with constrained RAM, whisper.cpp may be the more viable option; for those with ample RAM who need faster transcription, lightning-whisper-mlx is the better choice.