argmaxinc / WhisperKit

On-device Speech Recognition for Apple Silicon
http://argmaxinc.com/blog/whisperkit
MIT License
3.92k stars, 330 forks

Background Transcription Support? #194

Closed by normand1 3 months ago

normand1 commented 3 months ago

When I run the example application and put the app in the background, I get the error below. I can't find any explicit mention of whether background transcription is supported, so I'm asking in case there is a known solution. Thank you!

Error: command buffer exited with error status.
    The Metal Performance Shaders operations encoded on it may not have completed.
    Error: 
    (null)
    Insufficient Permission (to submit GPU work from background) (00000006:kIOGPUCommandBufferCallbackErrorBackgroundExecutionNotPermitted)
    <MTLDebugCommandBuffer: 0x102b7ca00> -> <AGXA14FamilyCommandBuffer: 0x101f872b0>
    label = MelSpectrogram_main__Op0_MpsGraphInference+ 
    device = <AGXA14Device: 0x102a24200>
        name = Apple A14 GPU 
    commandQueue = <AGXA14FamilyCommandQueue: 0x101f77320>
        label = <none> 
        device = <AGXA14Device: 0x102a24200>
            name = Apple A14 GPU 
    retainedReferences = 1

ZachNagengast commented 3 months ago

That's right: Core ML cannot run on the GPU while the app is in the background. The OS blocks background GPU work so that the foreground app gets priority for frame rate. On iOS in particular, I'd recommend using the ANE (Apple Neural Engine) compute unit, which is allowed to submit work in the background. You can also restrict it to CPU only, but since our models are optimized for the ANE, CPU accuracy may be degraded.
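
For anyone landing here later, a minimal sketch of the suggestion above, using only Core ML's own `MLModelConfiguration` and `MLComputeUnits`. The helper names `computeUnits(allowBackground:)` and `makeConfiguration(allowBackground:)` are hypothetical and not part of WhisperKit. How the resulting compute units get passed to WhisperKit's individual models depends on the library version and is not shown. The `MelSpectrogram_main__Op0_MpsGraphInference` label in the log suggests the mel spectrogram stage was the one submitting GPU work, so that stage in particular would need non-GPU compute units.

```swift
import CoreML

/// Hypothetical helper (not a WhisperKit API): choose Core ML compute units
/// that avoid the GPU when inference may continue in the background.
func computeUnits(allowBackground: Bool) -> MLComputeUnits {
    guard allowBackground else {
        // Foreground-only work can let Core ML pick any unit, including the GPU.
        return .all
    }
    if #available(iOS 16.0, macOS 13.0, tvOS 16.0, watchOS 9.0, *) {
        // CPU + Neural Engine, never the GPU.
        return .cpuAndNeuralEngine
    }
    // Older OS versions have no GPU-free ANE option, so fall back to CPU only.
    return .cpuOnly
}

/// Build a model configuration that uses the compute units chosen above.
func makeConfiguration(allowBackground: Bool) -> MLModelConfiguration {
    let config = MLModelConfiguration()
    config.computeUnits = computeUnits(allowBackground: allowBackground)
    return config
}
```

A configuration built with `makeConfiguration(allowBackground: true)` could then be used when loading a Core ML model, for example with `MLModel(contentsOf:configuration:)`. Whether the Neural Engine is actually used for a given model is still decided by Core ML at load time.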

normand1 commented 3 months ago

@ZachNagengast Got it, thank you, I'll look into it!