usefulsensors / openai-whisper

Robust Speech Recognition via Large-Scale Weak Supervision
MIT License

Running on iOS #4

Open bjnortier opened 1 year ago

bjnortier commented 1 year ago

@nyadla-sys Moving the conversation here from the C++ implementation.

I've made some progress: I have a prototype working on iOS with different bits hacked together. However, the CoreML delegate doesn't work. The Interpreter() doesn't initialise when created with the delegate, and I'm trying to figure out why.

nyadla-sys commented 1 year ago

Have you tried the GPU delegate? Also, the TF version must be greater than 2.4.

nyadla-sys commented 1 year ago

However, I've made some effort toward creating a full Int8 model, and its size is approximately 36MB. I'll release my code soon.

bjnortier commented 1 year ago

Yes I've tried GPU delegate but it doesn't load your model. I think it's because the output is int32:

"The Core ML delegate currently supports float (FP32 and FP16) models."
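One way to test the int32 hypothesis is to list the dtypes of the model's input and output tensors. The following is a minimal Python sketch, not code from this repo. It assumes TensorFlow is installed for the `inspect_model` helper, and the tensor names in the example data are hypothetical. A clean result here does not guarantee the delegate will accept the model, since ops inside the graph may also be unsupported.

```python
FLOAT_DTYPES = {"float16", "float32"}


def dtype_name(dtype):
    """Return a plain name such as 'int32' for a NumPy type or a string."""
    return getattr(dtype, "__name__", str(dtype))


def non_float_tensors(details):
    """List (name, dtype) pairs for tensors that are not FP16/FP32.

    `details` is a list of dicts with "name" and "dtype" keys, the shape
    returned by tf.lite.Interpreter.get_input_details() and
    get_output_details().
    """
    return [
        (d["name"], dtype_name(d["dtype"]))
        for d in details
        if dtype_name(d["dtype"]) not in FLOAT_DTYPES
    ]


def inspect_model(model_path):
    """Load a .tflite file and report its non-float inputs and outputs."""
    import tensorflow as tf  # imported lazily; only needed for real models

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    boundary = interpreter.get_input_details() + interpreter.get_output_details()
    return non_float_tensors(boundary)


# Hypothetical tensor details, standing in for a real model's metadata.
example = [
    {"name": "input_features", "dtype": "float32"},
    {"name": "output_tokens", "dtype": "int32"},
]
print(non_float_tensors(example))
```

Calling `inspect_model` with the path to the .tflite file would run the same check against the real model.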

nyadla-sys commented 1 year ago

Thanks @bjnortier for implementing a basic iOS app using whisper.tflite.

Source code for the iOS app (whisper.tflite hybrid model, 40MB model size): https://github.com/bjnortier/whisper-tflite-ios

bjnortier commented 1 year ago

It’s a very rough prototype but it works

nyadla-sys commented 1 year ago

https://apps.apple.com/in/app/whisper-asr/id6444556326

Please download the iOS app from the Apple App Store. It uses the Whisper TFLite model.

NickDarvey commented 1 year ago

@nyadla-sys, is whisper-asr based on the bjnortier/whisper-tflite-ios repo?

nyadla-sys commented 1 year ago

Yes, it uses part of that code.

nyadla-sys commented 1 year ago

The Whisper ASR iOS app is available now. Here is the link: https://apps.apple.com/in/app/whisper-asr/id6444556326

ankushg commented 1 year ago

> Yes I've tried GPU delegate but it doesn't load your model. I think it's because the output is int32:
>
> "The Core ML delegate currently supports float (FP32 and FP16) models."

Has there been any progress on creating a model that works with the CoreML Delegate?

nyadla-sys commented 1 year ago

@ankushg Please use this notebook to generate an FP32 model that is compatible with the Core ML delegate.

jj09 commented 1 year ago

@bjnortier thanks for the sample code for the iOS app! Is there a way to correlate the translated text with timestamps?

bjnortier commented 1 year ago

@jj09 Not in the TFLite version, which processes the whole 30-second chunk in one go. Whisper.cpp has better timestamp support.
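Because each inference call covers one 30-second window, the chunk offset gives at least a coarse time range for each piece of text. The following is a minimal Python sketch of that idea, not code from the iOS app. It assumes 16 kHz mono samples (Whisper's expected input rate), and `transcribe_chunk` is a hypothetical callback standing in for the actual model call. It does not provide word-level or segment-level timestamps.

```python
SAMPLE_RATE = 16_000       # Whisper expects 16 kHz mono audio
CHUNK_SECONDS = 30         # each inference call covers one 30-second window
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def split_into_chunks(samples):
    """Yield (start_seconds, chunk) pairs, zero-padding the final chunk."""
    for start in range(0, len(samples), CHUNK_SAMPLES):
        chunk = list(samples[start:start + CHUNK_SAMPLES])
        chunk += [0.0] * (CHUNK_SAMPLES - len(chunk))
        yield start / SAMPLE_RATE, chunk


def transcribe_with_offsets(samples, transcribe_chunk):
    """Pair each chunk's text with the time range it came from."""
    results = []
    total_seconds = len(samples) / SAMPLE_RATE
    for start, chunk in split_into_chunks(samples):
        end = min(start + CHUNK_SECONDS, total_seconds)
        results.append((start, end, transcribe_chunk(chunk)))
    return results


# 70 seconds of silence and a placeholder transcriber, for illustration only.
audio = [0.0] * (70 * SAMPLE_RATE)
segments = transcribe_with_offsets(audio, lambda chunk: f"<{len(chunk)} samples>")
for start, end, text in segments:
    print(f"[{start:.1f}s - {end:.1f}s] {text}")
```

Text that straddles a chunk boundary may be split across two ranges, so this is only a rough approximation of what Whisper.cpp offers.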

jj09 commented 1 year ago

@bjnortier Are there any samples showing how to do this with whisper.cpp?

bjnortier commented 1 year ago

Whisper.cpp contains ObjC and SwiftUI examples.