argmaxinc / WhisperKit

On-device Speech Recognition for Apple Silicon
https://takeargmax.com/blog/whisperkit
MIT License
3.17k stars 267 forks

Draft for reading files in chunks #122

Closed Abhinay1997 closed 3 months ago

Abhinay1997 commented 5 months ago

TODO:

From Zack:

The tricky part is keeping track of timestamps as we iterate through; there can be drift.

If we just do every 30s, we're going to be missing some text, because the model only transcribes up to a good breaking point and assumes we will seek to that point and run the next 30s.

Yeah, we just seek to them, which is where you could insert your method: instead of seeking, you load the next chunk and start at 0 time again.
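A rough sketch of the idea in the note above, written in Python only for brevity. This is not WhisperKit's API: `Segment`, `transcribe_in_chunks`, `decode_chunk`, and `fake_decode` are hypothetical names, and the "audio" is a toy list of utterances. It assumes the decoder returns only the segments it could finish inside the chunk, timed from 0 at the chunk start, so the caller has to add the chunk's offset back to keep absolute timestamps.

```python
from dataclasses import dataclass
from typing import Callable, List

WINDOW_S = 30.0  # window length mentioned in the note


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str


# Hypothetical decoder: given an absolute chunk start and a chunk length,
# it loads that chunk and returns segments timed from 0 at the chunk start.
Decoder = Callable[[float, float], List[Segment]]


def transcribe_in_chunks(total_s: float, decode_chunk: Decoder,
                         window_s: float = WINDOW_S,
                         seek_to_break: bool = True) -> List[Segment]:
    results: List[Segment] = []
    offset = 0.0  # absolute start time of the current chunk
    while offset < total_s:
        length = min(window_s, total_s - offset)
        local = decode_chunk(offset, length)
        # Chunk-local times restart at 0, so shift them by the chunk offset.
        for seg in local:
            results.append(Segment(seg.start + offset, seg.end + offset, seg.text))
        if seek_to_break and local and local[-1].end > 0:
            # Start the next chunk at the last good breaking point.
            advance = local[-1].end
        else:
            # Nothing decoded (or fixed stepping requested): skip the whole chunk.
            advance = length
        offset += advance
    return results


# Toy stand-in for an audio file: utterances with absolute times (assumption).
UTTERANCES = [
    Segment(0, 10, "a"), Segment(12, 28, "b"), Segment(29, 33, "c"),
    Segment(40, 55, "d"), Segment(58, 62, "e"),
]


def fake_decode(chunk_start: float, chunk_len: float) -> List[Segment]:
    # Only utterances that fit entirely inside the chunk are "transcribed".
    chunk_end = chunk_start + chunk_len
    return [
        Segment(u.start - chunk_start, u.end - chunk_start, u.text)
        for u in UTTERANCES
        if u.start >= chunk_start and u.end <= chunk_end
    ]


seeking = transcribe_in_chunks(70.0, fake_decode)
fixed = transcribe_in_chunks(70.0, fake_decode, seek_to_break=False)
print([s.text for s in seeking])  # ['a', 'b', 'c', 'd', 'e']
print([s.text for s in fixed])    # ['a', 'b', 'd']
```

With fixed 30s steps, the utterances that straddle a chunk boundary ("c" and "e") are lost. Starting each new chunk at the previous breaking point recovers them, and adding the chunk offset keeps the reported timestamps absolute.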

Abhinay1997 commented 5 months ago

Putting on hold in favour of #125