exPHAT / SwiftWhisper

🎤 The easiest way to transcribe audio in Swift
MIT License
593 stars 63 forks

Add support for word-level timestamps `--word_timestamps` #17

Closed martinlexow closed 1 year ago

martinlexow commented 1 year ago

First of all: Thank you for coding this Swift Package, it's terrific! 🙏

What I'm missing: I'd love to get word-level timestamps like those mentioned in the Whisper API.

As I understand it, this would require being able to set --word_timestamps to true. (Maybe WhisperParams would be a good place for that?)

Keep up the great work!

Best, Martin

martinlexow commented 1 year ago

Just stumbled upon this comment where it says that using --max-len 1 should result in word-level timestamps.

However, setting parameter.max_len = 1 seems to make no difference for me; I still get phrases/sentences.

exPHAT commented 1 year ago

Someone had asked about this in https://github.com/exPHAT/SwiftWhisper/issues/6

Let me know if that works for you.

martinlexow commented 1 year ago

This indeed works pretty well. Apologies for the duplicate!

parameter.split_on_word = true
parameter.max_len = 1
parameter.token_timestamps = true
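
For anyone landing here later, this is a minimal sketch of how the one-word segments produced by the three settings above might be printed with their timestamps. It does not use SwiftWhisper's own types: `WordTimestampSettings` and `WordSegment` are hypothetical stand-ins so the snippet runs on its own, and the millisecond time unit and the sample values are assumptions made for illustration.

```swift
import Foundation

// Hypothetical stand-in for the three parameters named in this thread.
// With SwiftWhisper you would set them on the package's params object instead.
struct WordTimestampSettings {
    var splitOnWord = false
    var maxLen = 0
    var tokenTimestamps = false
}

// Hypothetical stand-in for one transcription segment.
// Assumption: start and end times are in milliseconds.
struct WordSegment {
    let startTime: Int
    let endTime: Int
    let text: String
}

// Mirrors the combination reported to work in this thread.
func wordLevelSettings() -> WordTimestampSettings {
    var settings = WordTimestampSettings()
    settings.splitOnWord = true
    settings.maxLen = 1
    settings.tokenTimestamps = true
    return settings
}

// Left-pads a non-negative integer with zeros up to `width` digits.
func pad(_ value: Int, to width: Int) -> String {
    let digits = String(value)
    return String(repeating: "0", count: max(0, width - digits.count)) + digits
}

// Formats milliseconds as mm:ss.SSS.
func formatTimestamp(_ ms: Int) -> String {
    let minutes = ms / 60_000
    let seconds = (ms % 60_000) / 1_000
    let millis = ms % 1_000
    return "\(pad(minutes, to: 2)):\(pad(seconds, to: 2)).\(pad(millis, to: 3))"
}

// Renders one segment as "[start - end] word", trimming surrounding
// whitespace defensively in case the word text carries any.
func describe(_ segment: WordSegment) -> String {
    let word = segment.text.trimmingCharacters(in: .whitespaces)
    return "[\(formatTimestamp(segment.startTime)) - \(formatTimestamp(segment.endTime))] \(word)"
}

// Made-up sample data, not real transcription output.
let sample = [
    WordSegment(startTime: 0, endTime: 320, text: " Hello"),
    WordSegment(startTime: 320, endTime: 910, text: " world"),
]

for segment in sample {
    print(describe(segment))
}
```

In a real project the sample array would be replaced by the segments returned from the transcription call, with the field names adjusted to whatever the package actually exposes.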