haesleinhuepf / faster-whisper-stt

Simple and basic python function for transcribing audio in realtime based on faster-whisper
BSD 3-Clause "New" or "Revised" License

refactoring #2

Closed haesleinhuepf closed 2 weeks ago

haesleinhuepf commented 2 weeks ago

We should take all the functions out of https://github.com/haesleinhuepf/faster-whisper-stt/blob/main/continuous_listening.ipynb and put them into a utilities.py file. We can then update the notebook continuous_listening.ipynb to import those functions.

haesleinhuepf commented 2 weeks ago

git-bob think about this. How would you approach this?

github-actions[bot] commented 2 weeks ago

This message was generated by git-bob (version: 0.2.3, model: gpt-4o-2024-08-06), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

To refactor and modularize the code from continuous_listening.ipynb, you should create a new file named utilities.py and move the functions from the notebook into it. Here's a step-by-step approach:

  1. Create utilities.py: Create a new file named utilities.py in your project directory.

  2. Move Functions: Transfer the following functions from the Jupyter Notebook continuous_listening.ipynb to utilities.py:

    • def audio_callback(indata, frames, time, status):
    • def start_recording(b=None):
    • def interrupt_recording(b=None):
    • def transcribe_audio(audio_path):
    • def reset_transcription(b):

  3. Update Imports: In continuous_listening.ipynb, import these functions at the beginning of the notebook:

    from utilities import audio_callback, start_recording, interrupt_recording, transcribe_audio, reset_transcription

  4. Maintain Stream Initialization: Keep the initialization of the stream variable in the notebook, as it depends on audio_callback. Ensure any dependencies required for these functions are also imported in utilities.py.

This will help keep the notebook cleaner and make the functions reusable.
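Steps 2 and 4 pull in different directions: start_recording and interrupt_recording use the stream, yet the stream is meant to stay in the notebook. One way to reconcile this is to have the moved functions receive the stream and the queue as arguments instead of reading notebook globals. The following is a minimal sketch under that assumption, not the repository's actual code. make_audio_callback, the extra parameters of start_recording, and FakeStream (a stand-in for sounddevice.InputStream so the example runs without audio hardware) are all hypothetical.

```python
import queue


def make_audio_callback(audio_queue):
    """Build a callback with the (indata, frames, time, status) signature
    that sounddevice passes to stream callbacks."""
    def audio_callback(indata, frames, time, status):
        # Copy the chunk, because the caller may reuse the input buffer.
        audio_queue.put(indata.copy())
    return audio_callback


def start_recording(stream, audio_queue):
    """Discard stale chunks, then start the given stream."""
    while True:
        try:
            audio_queue.get_nowait()
        except queue.Empty:
            break
    stream.start()


class FakeStream:
    """Hypothetical stand-in exposing only start() and stop()."""
    def __init__(self):
        self.active = False

    def start(self):
        self.active = True

    def stop(self):
        self.active = False


# Notebook side: the shared state is created here and passed in explicitly.
audio_queue = queue.Queue()
callback = make_audio_callback(audio_queue)
stream = FakeStream()

callback([1, 2, 3], 3, None, None)    # a leftover chunk from an earlier run
start_recording(stream, audio_queue)  # drains the queue, then starts the stream
```

In the notebook, the real stream would be created with sd.InputStream(callback=callback, ...) and handed to start_recording in the same way.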

haesleinhuepf commented 2 weeks ago

git-bob implement this!

haesleinhuepf commented 2 weeks ago

Make sure that the content of the functions in utilities.py is copied over from continuous_listening.ipynb!

git-bob implement this

haesleinhuepf commented 2 weeks ago

Can you tell me what code is in the functions in continuous_listening.ipynb?

git-bob think about this.

github-actions[bot] commented 2 weeks ago

This message was generated by git-bob (version: 0.2.3, model: gpt-4o-2024-08-06), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

To refactor as suggested, follow these steps:

  1. Create utilities.py File:
# utilities.py

import sounddevice as sd
import numpy as np
import queue
from scipy.io.wavfile import write
from faster_whisper import WhisperModel

audio_queue = queue.Queue()
sample_rate = 16000

def audio_callback(indata, frames, time, status):
    """This function is called by sounddevice during audio recording."""
    audio_queue.put(indata.copy())

stream = sd.InputStream(callback=audio_callback, samplerate=sample_rate, channels=1, dtype='int16')

def start_recording(b=None):
    """Starts audio recording."""
    audio_queue.queue.clear()  # Clears the queue before recording
    stream.start()

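def interrupt_recording(b=None, audio_data=None, my_text=None):
    """Stops audio recording and starts transcription."""
    stream.stop()

    # Avoid a mutable default argument, which would be shared across calls
    if audio_data is None:
        audio_data = []

    # Keep only the last 20 chunks, trimming in place so the caller's list is updated
    if len(audio_data) > 20:
        del audio_data[:-20]

    # Move all chunks recorded so far from the queue into the buffer
    while not audio_queue.empty():
        audio_data.append(audio_queue.get())
    start_recording()
    if audio_data:
        audio_np = np.concatenate(audio_data, axis=0)
        write('output.wav', sample_rate, audio_np.astype(np.int16))  # Writes WAV file
        transcribe_audio('output.wav', my_text)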

def transcribe_audio(audio_path, my_text):
    """Transcribes the recorded audio."""
    model = WhisperModel("small", device="cpu")
    segments, _ = model.transcribe(audio_path, language="de", word_timestamps=True)
    transcription = " ".join([segment.text for segment in segments])
    my_text.value = transcription

def reset_transcription(b, text_output):
    """Resets the transcription."""
    text_output.value = ""

  2. Update continuous_listening.ipynb:

Replace the function definitions and imports:

from ipywidgets import Textarea

from utilities import audio_callback, start_recording, interrupt_recording, transcribe_audio, reset_transcription

audio_data = []
my_text = Textarea()
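
Note that interrupt_recording and reset_transcription now take extra arguments (audio_data, my_text, text_output), while an ipywidgets button calls its on_click handler with the button as the only argument. A possible way to bridge that gap is functools.partial. The snippet below is a hedged sketch rather than the notebook's actual wiring: FakeButton and FakeTextarea are hypothetical stand-ins for ipywidgets.Button and ipywidgets.Textarea so that it runs without a Jupyter front end.

```python
from functools import partial


class FakeTextarea:
    """Hypothetical stand-in for ipywidgets.Textarea (only .value)."""
    def __init__(self):
        self.value = ""


class FakeButton:
    """Hypothetical stand-in for ipywidgets.Button: each handler
    registered with on_click is called with the button itself."""
    def __init__(self):
        self._handlers = []

    def on_click(self, handler):
        self._handlers.append(handler)

    def click(self):
        for handler in self._handlers:
            handler(self)


def reset_transcription(b, text_output):
    """Same body as the function proposed for utilities.py."""
    text_output.value = ""


my_text = FakeTextarea()
my_text.value = "previous transcription"

reset_button = FakeButton()
# Bind the widget up front; the button supplies the remaining argument b.
reset_button.on_click(partial(reset_transcription, text_output=my_text))
reset_button.click()
```

The same pattern would apply to interrupt_recording, for example partial(interrupt_recording, audio_data=audio_data, my_text=my_text).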

haesleinhuepf commented 2 weeks ago

Yes, that sounds great!

git-bob implement it like this!