alesaccoia / VoiceStreamAI

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
MIT License
741 stars 107 forks

Transcribing fragmented audio with codec #32

Closed: kenho211 closed this issue 4 months ago

kenho211 commented 5 months ago

Is there a way to get the audio from fragmented audio, for example from an MPEG-DASH / HLS stream?

alesaccoia commented 5 months ago

I think you'd need to put something in front of VoiceStreamAI to do that.

On Thu, 27 Jun 2024 at 18:42, kenho211 wrote:

> Is there a way to get the audio from fragmented audio, for example from an MPEG-DASH / HLS stream?

kenho211 commented 4 months ago

Thank you. I was able to do that by concatenating the init segment with the other segments and then parsing the result.
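
For anyone landing here later, the approach described above might look roughly like the Python sketch below. It is not the code actually used in this thread. It assumes the stream uses fragmented MP4 (fMP4/CMAF) segments, that the init segment carries the `moov` box, and that the `ffmpeg` CLI is on the PATH for decoding. The function names `iter_boxes`, `join_segments`, and `decode_to_pcm` are hypothetical, and the 16 kHz mono PCM output is an assumption about what the downstream transcriber wants.

```python
import struct
import subprocess
from typing import Iterable, Iterator, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level ISO BMFF box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < 8:
            break  # malformed box; stop rather than loop forever
        yield box_type.decode("latin-1"), offset, size
        offset += size


def join_segments(init_segment: bytes, media_segments: Iterable[bytes]) -> bytes:
    """Prepend the init segment to the media segments (moof + mdat pairs)."""
    box_types = [box_type for box_type, _, _ in iter_boxes(init_segment)]
    if "moov" not in box_types:
        raise ValueError("init segment has no 'moov' box")
    return init_segment + b"".join(media_segments)


def decode_to_pcm(fmp4: bytes, sample_rate: int = 16000) -> bytes:
    """Decode the joined bytes to mono 16-bit little-endian PCM via the ffmpeg CLI."""
    cmd = [
        "ffmpeg", "-loglevel", "error",
        "-i", "pipe:0",            # read the container from stdin
        "-f", "s16le",             # raw PCM output, no container
        "-acodec", "pcm_s16le",
        "-ac", "1",                # downmix to mono
        "-ar", str(sample_rate),   # resample
        "pipe:1",                  # write PCM to stdout
    ]
    result = subprocess.run(cmd, input=fmp4, stdout=subprocess.PIPE, check=True)
    return result.stdout
```

Usage would be something like `pcm = decode_to_pcm(join_segments(init_bytes, [seg1, seg2]))`, with the resulting PCM forwarded to VoiceStreamAI by whatever client sits in front of it. The media segments cannot be decoded on their own because the codec configuration lives in the init segment, which is why it has to come first.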

alesaccoia commented 4 months ago

@kenho211 that's a nice use case. If you want to contribute a minimal demo to the repo, it would be welcome. Cheers