onuratakan / gpt-computer-assistant

gpt-4o for windows, macos and linux
MIT License
4.75k stars 441 forks

feat: Adding wake word mechanism #118

Closed DawoodTouseef closed 2 weeks ago

DawoodTouseef commented 3 weeks ago

Add a wake word detection mechanism so the user can activate the assistant by saying a wake word.

onuratakan commented 3 weeks ago

Hi, it's amazing, really amazing. Can you explain more about this function?

DawoodTouseef commented 3 weeks ago

This function, wake_word, is designed to continuously listen for a specific wake word using the Picovoice Porcupine wake word detection engine. When the wake word is detected, it prints a message and returns True. Below is a detailed explanation of each part of the function.

Function: wake_word

Parameters:

access_key: str: A string representing the access key required to initialize the Picovoice Porcupine wake word detection engine.

Key Components and Workflow:

Import Statements:

import pvporcupine
import pyaudio
import struct

pvporcupine: The Porcupine wake word detection library.
pyaudio: A library to work with audio input and output.
struct: A library to handle binary data and convert it into Python data types.

Initialize Porcupine:


porcupine = pvporcupine.create(access_key=access_key, keywords=pvporcupine.KEYWORDS)

This creates an instance of the Porcupine wake word engine using the provided access key. Passing pvporcupine.KEYWORDS enables the library's built-in keywords, so detecting any one of them counts as the wake word.

Initialize PyAudio:


pa = pyaudio.PyAudio()

This initializes the PyAudio library, which is used to capture audio from the microphone.

Open an Audio Stream:

audio_stream = pa.open(
    rate=porcupine.sample_rate,
    channels=1,
    format=pyaudio.paInt16,
    input=True,
    frames_per_buffer=porcupine.frame_length
)

This opens an audio stream with the following parameters:

rate: The sample rate of the audio stream, which matches Porcupine's sample rate.
channels: The number of audio channels (1 for mono).
format: The format of the audio stream; pyaudio.paInt16 specifies 16-bit PCM.
input: Indicates that the stream is for input (i.e., capturing audio).
frames_per_buffer: The number of audio frames per buffer, matching Porcupine's frame length.

Listening for Wake Word:

print_("Listening for wake word...", color="yellow")

This prints a message indicating that the system is listening for the wake word.

Continuous Listening Loop:

while True:
    pcm = audio_stream.read(porcupine.frame_length)
    pcm = struct.unpack_from("h" * porcupine.frame_length, pcm)

    keyword_index = porcupine.process(pcm)

    if keyword_index >= 0:
        print_("Wake word detected!", color="green")
        return True

This loop continuously reads audio data from the microphone:

audio_stream.read(porcupine.frame_length): Reads a chunk of audio data with the length specified by porcupine.frame_length.
struct.unpack_from("h" * porcupine.frame_length, pcm): Converts the raw audio data into a tuple of integers representing the PCM audio samples.
porcupine.process(pcm): Processes the audio frame to check whether the wake word is detected. It returns the index of the detected keyword, or -1 if no keyword is detected.

If keyword_index is greater than or equal to 0, the wake word was detected: a message is printed and the function returns True.
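To make the loop easier to try without a microphone or an access key, here is a rough sketch of the same read, unpack, process cycle with the engine and the stream passed in as arguments. The names unpack_frame, wait_for_wake_word, max_frames, FakeEngine and FakeStream are my own illustrations and are not part of the proposed function or of any library. The sketch also releases the stream and the engine in a finally block, which the snippet above does not do; it assumes the real objects expose close() and delete() as PyAudio streams and Porcupine instances do.

```python
import struct


def unpack_frame(raw: bytes, frame_length: int) -> tuple:
    """Turn raw 16-bit PCM bytes into a tuple of ints, as the loop above does."""
    return struct.unpack_from("h" * frame_length, raw)


def wait_for_wake_word(engine, stream, max_frames=None) -> bool:
    """Read frames until engine.process() reports a keyword (index >= 0).

    engine: object with frame_length, process(pcm) and delete(), like Porcupine.
    stream: object with read(n) and close(), like a PyAudio input stream.
    max_frames: hypothetical safety limit; None means listen indefinitely.
    """
    frames_read = 0
    try:
        while max_frames is None or frames_read < max_frames:
            raw = stream.read(engine.frame_length)
            frames_read += 1
            if engine.process(unpack_frame(raw, engine.frame_length)) >= 0:
                return True
        return False
    finally:
        # Release the microphone and the engine however the loop ends.
        stream.close()
        engine.delete()


class FakeEngine:
    """Hypothetical stand-in: 'detects' a keyword when a frame contains a 1."""

    frame_length = 4

    def __init__(self):
        self.deleted = False

    def process(self, pcm):
        return 0 if 1 in pcm else -1

    def delete(self):
        self.deleted = True


class FakeStream:
    """Hypothetical stand-in that replays pre-built frames instead of a mic."""

    def __init__(self, frames):
        self._frames = iter(frames)
        self.closed = False

    def read(self, num_frames):
        return next(self._frames)

    def close(self):
        self.closed = True


# Dry run: two silent frames, then one frame that triggers the fake engine.
silence = struct.pack("4h", 0, 0, 0, 0)
trigger = struct.pack("4h", 0, 1, 0, 0)
engine, stream = FakeEngine(), FakeStream([silence, silence, trigger])
detected = wait_for_wake_word(engine, stream)
```

With the real libraries, the porcupine instance and audio_stream created earlier could be passed in place of the fakes. Injecting them this way keeps the loop itself testable and makes the cleanup step explicit.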

onuratakan commented 2 weeks ago

Thank you for your detailed explanation. Actually, I will add this as a feature option in the chat settings.