argmaxinc / WhisperKit

On-device Speech Recognition for Apple Silicon
https://takeargmax.com/blog/whisperkit
MIT License

Specialization takes a really long time #17

Closed yych42 closed 7 months ago

yych42 commented 7 months ago

I'm trying the demo app on a MacBook Pro with an Apple M1 Pro and 16 GB of memory. The large-v3_turbo_1049MB model has been specializing for more than 30 minutes, and aned is still running and using a whole performance core. Have you tested the loading time on different devices?

atiorh commented 7 months ago

Hi @yych42, this is a known issue: the ANE compiler takes a very long time to specialize turbo variants on A14 and M1 chips. We recently disabled turbo variants on devices with these chip generations but haven't updated the example app yet. In the interim, we recommend using the regular large-v3 model.
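
For anyone who wants to apply the same workaround in their own tooling, here is a rough sketch of the selection logic described above. It is not WhisperKit code: the function names, the chip-name parsing, and the assumption that every M1-family chip (M1, M1 Pro, M1 Max, ...) is affected are illustrative only.

```python
# Hypothetical helper, not part of WhisperKit: fall back from a turbo
# variant to the regular model on chip generations where ANE
# specialization is known to be very slow.

# Assumption: "A14 and M1 chips" covers the whole M1 family.
AFFECTED_GENERATIONS = {"A14", "M1"}


def is_affected_chip(chip_name: str) -> bool:
    """Return True if the chip name belongs to an affected generation."""
    # Drop a leading vendor word so "Apple M1 Pro" and "M1 Pro" both work.
    tokens = [t for t in chip_name.split() if t.lower() != "apple"]
    return bool(tokens) and tokens[0] in AFFECTED_GENERATIONS


def choose_model(chip_name: str, requested: str, fallback: str = "large-v3") -> str:
    """Pick the requested model unless it is a turbo variant on an affected chip."""
    if "turbo" in requested.lower() and is_affected_chip(chip_name):
        return fallback
    return requested


if __name__ == "__main__":
    print(choose_model("Apple M1 Pro", "large-v3_turbo_1049MB"))
    print(choose_model("Apple M2", "large-v3_turbo_1049MB"))
```

On the machine from the original report (Apple M1 Pro), this would select large-v3 instead of the turbo variant, while other chip generations keep whatever model was requested.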

ZachNagengast commented 7 months ago

Following up here: there were a few updates in #20 that should help with this, but specialization is still a hard requirement from Apple that we don't have much control over via CoreML.