alesaccoia / VoiceStreamAI

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
MIT License
555 stars, 73 forks

audio from video element? #31

Open ROBERT-MCDOWELL opened 1 week ago

ROBERT-MCDOWELL commented 1 week ago

How can I take audio chunks from a video element and transcribe them? Thanks.

alesaccoia commented 1 week ago

Nice use case. Google AI told me to do something like this:

    <div class="controls">
        <!-- ... existing controls ... -->
        <div class="control-group">
            <label class="label" for="videoSource">Video Source:</label>
            <input type="file" id="videoSource" accept="video/*">
        </div>
    </div>
    <video id="myVideo" width="320" height="240" controls></video> <br/>
    <button id="startButton" onclick='startRecording()' disabled>Start Streaming</button>
    <button id="stopButton" onclick='stopRecording()' disabled>Stop Streaming</button>
    <div id="transcription"></div>
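    <script>
        // ... (Existing code) ...

        let videoElement = document.getElementById('myVideo');

        // Load the selected video file and enable the Start button
        document.getElementById('videoSource').addEventListener('change', function(e) {
            let file = e.target.files[0];
            if (!file) return; // the file dialog was cancelled
            videoElement.src = URL.createObjectURL(file);
            document.getElementById('startButton').disabled = false;
        });

        function startRecording() {
            if (isRecording) return;
            isRecording = true;

            const AudioContext = window.AudioContext || window.webkitAudioContext;
            context = new AudioContext();

            // Use the video element as the audio source. A media element can be
            // bound to a MediaElementAudioSourceNode only once, so a second start
            // with a fresh context will throw unless the context and source are reused.
            let source = context.createMediaElementSource(videoElement);
            processor = context.createScriptProcessor(bufferSize, 1, 1);
            processor.onaudioprocess = e => processAudio(e);
            source.connect(processor);
            processor.connect(context.destination);
            // Also route the video's audio to the speakers: the processor's output
            // stays silent unless processAudio writes to its output buffer.
            source.connect(context.destination);

            sendAudioConfig();

            // Audio only flows while the video is playing.
            // Disable start button and enable stop button
            document.getElementById('startButton').disabled = true;
            document.getElementById('stopButton').disabled = false;
        }
    </script>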
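The `processAudio` handler is part of the existing code and is not shown above. As a rough sketch of what such a handler typically does with each chunk, assuming the server expects 16 kHz mono 16-bit PCM (an assumption, not confirmed in this thread), the conversion could look like the following. The names `TARGET_SAMPLE_RATE`, `downsample`, `floatTo16BitPCM` and `encodeChunk` are hypothetical and only for illustration:

```javascript
// Hypothetical helpers; the 16 kHz target rate is an assumption.
const TARGET_SAMPLE_RATE = 16000;

// Lower the sample rate by averaging the input samples that map to each output sample.
function downsample(samples, inputRate, outputRate) {
    if (outputRate >= inputRate) return samples;
    const ratio = inputRate / outputRate;
    const outLength = Math.floor(samples.length / ratio);
    const out = new Float32Array(outLength);
    for (let i = 0; i < outLength; i++) {
        const start = Math.floor(i * ratio);
        const end = Math.min(Math.floor((i + 1) * ratio), samples.length);
        let sum = 0;
        for (let j = start; j < end; j++) sum += samples[j];
        out[i] = sum / (end - start);
    }
    return out;
}

// Convert Web Audio floats in [-1, 1] to signed 16-bit integers, clamping out-of-range values.
function floatTo16BitPCM(samples) {
    const out = new Int16Array(samples.length);
    for (let i = 0; i < samples.length; i++) {
        const s = Math.max(-1, Math.min(1, samples[i]));
        out[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
    }
    return out;
}

// Turn one channel of a chunk into an ArrayBuffer that can be sent over a WebSocket.
function encodeChunk(channelData, inputRate) {
    const resampled = downsample(channelData, inputRate, TARGET_SAMPLE_RATE);
    return floatTo16BitPCM(resampled).buffer;
}
```

Inside an `onaudioprocess` callback this would be called roughly as `encodeChunk(e.inputBuffer.getChannelData(0), context.sampleRate)`. Because the source is the video element rather than a microphone, nothing else in the conversion needs to change.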

Let me know if that works.

ROBERT-MCDOWELL commented 1 week ago

Thanks very much for your quick answer. For now I'm struggling to get the server side running. I get "RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version", and I'm investigating how to solve it.

ROBERT-MCDOWELL commented 1 week ago

Please leave this open so that once I solve my server-side problem I can try your code. Thank you.

alesaccoia commented 1 week ago

Good luck. I run into CUDA issues all the time as well. You could try the Docker image, which avoids messing up your local environment.

ROBERT-MCDOWELL commented 1 week ago

ok thanks