- Click `Choose OS` and select Raspberry Pi OS (64-bit) or Ubuntu 22.04.2 LTS (64-bit).
- Click `Choose Storage` and select the SD card.
- Click `Write` and wait for the imaging to complete.

The conversational speaker uses Azure Cognitive Services for speech-to-text and text-to-speech. Below are the steps to create an Azure account and an instance of Azure Cognitive Services.
- Click `Try Azure for Free`.
- Click `Start Free` to start creating a free Azure account.

NOTE: Even though this is a free account, Azure still requires credit card information. You will not be charged unless you change settings later.
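- Search for `Cognitive Services`. Under `Marketplace`, select `Cognitive Services`. (It may take a few seconds to populate.)
- Under `Resource Group`, select `Create New`. Enter a resource group name (e.g. `conv-speak-rg`).
- Select a region and enter a name for your instance of Azure Cognitive Services (e.g. `my-conv-speak-cog-001`).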
NOTE: EastUS, WestEurope, or SoutheastAsia are recommended, as those regions tend to support the greatest number of features.
- Click `Review + Create`. After validation passes, click `Create`.
- Click `Go to resource` to view your Azure Cognitive Services resource.
- Under `Resource Management`, select `Keys and Endpoint`.
- Copy either of the two Cognitive Services keys. Save this key in a secure location for later.
Windows 11 users: If the application is stalling when calling the text-to-speech API, make sure you have applied all current security updates (link).
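Before putting the key into the speaker's configuration, it can help to confirm that the key and region belong together. The sketch below is not part of this repository. It assumes that the regional token endpoint `https://<region>.api.cognitive.microsoft.com/sts/v1.0/issueToken` accepts your key, which should hold for a Speech or multi-service resource but is worth checking against your resource's `Keys and Endpoint` page. The function names are hypothetical.

```python
import urllib.error
import urllib.request


def build_token_request(key: str, region: str) -> urllib.request.Request:
    """Build (but do not send) a token request for the given key and region."""
    # Assumption: the regional token endpoint of the Speech service.
    host = f"{region.strip().lower()}.api.cognitive.microsoft.com"
    url = f"https://{host}/sts/v1.0/issueToken"
    return urllib.request.Request(
        url,
        data=b"",  # the endpoint expects a POST with an empty body
        headers={"Ocp-Apim-Subscription-Key": key},
        method="POST",
    )


def key_is_valid(key: str, region: str, timeout: float = 10.0) -> bool:
    """Return True if a token is issued, False if the service rejects the key.

    Network failures (urllib.error.URLError) are left to propagate.
    """
    request = build_token_request(key, region)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # An HTTP error here usually means a wrong key or a key/region mismatch.
        return False
```

Calling `key_is_valid(your_key, "eastus")` sends one small request and does not synthesize or transcribe any audio.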
The conversational speaker uses OpenAI's models to hold a friendly conversation. Below are the steps to create a new account and access the AI models. The project supports either the official OpenAI API or the Azure OpenAI API; choose one.

- Click `Sign up`.

NOTE: You can use a Google account, a Microsoft account, or an email address to create a new account.
NOTE: If you are new to OpenAI, please review the usage guidelines (https://beta.openai.com/docs/usage-guidelines).
- Click `View API keys`.
- Click `+ Create new secret key`. Copy the generated key and save it in a secure location for later.

If you are curious to play with the large language models directly, check out the Playground (https://platform.openai.com/playground?mode=chat) at the top of the page after logging in to https://aka.ms/maker/openai.
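To check that a newly created secret key works, one option is to list the models it can access. This is a minimal sketch using only the standard library, not code from this repository. It assumes the `GET https://api.openai.com/v1/models` endpoint with a bearer token, and the helper names are hypothetical.

```python
import json
import urllib.request
from typing import List


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request that lists the models visible to the key."""
    return urllib.request.Request(
        "https://api.openai.com/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_model_ids(api_key: str, timeout: float = 10.0) -> List[str]:
    """Return the sorted model ids; raises urllib.error.HTTPError for a rejected key."""
    request = build_models_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: the response is an object whose "data" list holds items with an "id".
    return sorted(model["id"] for model in payload["data"])
```

Reading the key from an environment variable (for example `os.environ["OPENAI_API_KEY"]`) rather than pasting it into a script keeps it out of your shell history and source control.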
Choose between an official OpenAI account and an Azure OpenAI account
- Create an Azure Account
  - If you don't have an Azure account, go to the official Azure website to sign up. Azure offers a free account option, and new users receive a limited amount of free credit for testing and learning.
- Apply for Access
  - On the Azure OpenAI service page, click the "Apply for Access" button. This takes you to an application page where you provide the required information, such as your company name and use case.
- Configure and Use
  - Once you have access, create a new OpenAI service resource in the Azure portal. After creation, you can get the API key and start using the Azure OpenAI service by following the official documentation.
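With Azure OpenAI, requests go to your own resource endpoint and name a deployment rather than a model. The sketch below shows how the endpoint, deployment name, and API version might combine into a chat completions URL. It assumes the REST path `/openai/deployments/<deployment>/chat/completions` and the `api-key` header, and uses `2024-02-01` because that is the API version in this project's `config.json`. The helper names and the example resource name are hypothetical.

```python
from typing import Dict
from urllib.parse import quote


def chat_completions_url(endpoint: str, deployment: str,
                         api_version: str = "2024-02-01") -> str:
    """Build the chat completions URL for one Azure OpenAI deployment."""
    base = endpoint.rstrip("/")  # tolerate a trailing slash copied from the portal
    name = quote(deployment, safe="")  # deployment names are user-chosen
    return f"{base}/openai/deployments/{name}/chat/completions?api-version={api_version}"


def request_headers(key: str) -> Dict[str, str]:
    """Headers for an Azure OpenAI REST call authenticated with Key 1 or Key 2."""
    return {"api-key": key, "Content-Type": "application/json"}


if __name__ == "__main__":
    # Hypothetical resource and deployment names, for illustration only.
    print(chat_completions_url("https://my-resource.openai.azure.com/", "my-deployment"))
```

The deployment name is the one you chose when deploying a model, which is why it can differ from the underlying model name.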
The Code
1. Code Configuration
- The Python Speech SDK package is available for Windows (x64 and x86), Mac x64 (macOS X version 10.14 or later), Mac arm64 (macOS version 11.0 or later), and Linux.
- On the Raspberry Pi or your PC, open a command-line terminal.
- On Ubuntu or Debian, run the following commands for the installation of required packages:
sudo apt-get update && sudo apt-get install libssl-dev libasound2
- On Ubuntu 22.04 LTS you must also download and install the latest libssl1.1 package, e.g. from http://security.ubuntu.com/ubuntu/pool/main/o/openssl/.
- Clone the repo.
git clone https://github.com/jackwuwei/gptspeaker.git
- Set your API keys: In `config.json`, replace `{AzureCognitiveServices.Key}` and `{AzureCognitiveServices.Region}` with your Azure Cognitive Services key and region, and `{OpenAI.Key}` with your OpenAI API key. Configure only one of `OpenAI` and `AzureOpenAI`. For `AzureOpenAI`, `Key` is Key 1 or Key 2, `Endpoint` is the resource endpoint, and `Model` is the Azure AI Studio deployment name.

{
  "AzureCognitiveServices": {
    "Key": "AzureCognitiveServicesKey",
    "Region": "AzureCognitiveServicesRegion"
  },
  "OpenAI": {
    "Key": "OpenAIKey"
  },
  "AzureOpenAI": {
    "Key": "",
    "api_version": "2024-02-01",
    "Endpoint": "",
    "Model": ""
  }
}
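A quick check of `config.json` before the first run can catch placeholders that were never replaced. The sketch below is a hypothetical helper, not the loader this repository uses. It assumes the key names shown above and treats the template values as unset. Note that Python's standard `json` module rejects `//` comments and trailing commas, so if the project parses the file with that module (an assumption), the file has to be plain JSON.

```python
import json
from pathlib import Path
from typing import List


def find_problems(config: dict) -> List[str]:
    """Return human-readable problems found in a parsed config dictionary."""
    problems = []

    speech = config.get("AzureCognitiveServices", {})
    for field in ("Key", "Region"):
        value = speech.get(field, "")
        # The template ships with placeholder strings such as "AzureCognitiveServicesKey".
        if not value or value == f"AzureCognitiveServices{field}":
            problems.append(f"AzureCognitiveServices.{field} is not set")

    openai_key = config.get("OpenAI", {}).get("Key", "")
    openai_ready = bool(openai_key) and openai_key != "OpenAIKey"

    azure = config.get("AzureOpenAI", {})
    azure_ready = all(azure.get(field) for field in ("Key", "Endpoint", "Model"))

    if not (openai_ready or azure_ready):
        problems.append("Set OpenAI.Key, or AzureOpenAI Key, Endpoint and Model")
    return problems


def check_config_file(path: str = "config.json") -> List[str]:
    """Parse the file as strict JSON and report any problems."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return find_problems(config)
```

An empty list from `check_config_file()` means the speech settings and at least one of the two chat back ends look filled in. It does not verify that the keys are accepted by the services.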
1. Install requirements
```bash
# Install the Python dependencies
pip3 install -r requirements.txt
# Start the speaker
python3 gptspeaker.py
```
The code base already has a default wake phrase (`"Hey GPT"`), which I suggest you use first. If you want to create your own (free!) custom wake word, follow the steps below.
- Obtain the `.table` file for your custom wake word and copy it to the source root directory.
- Update the `config.json` file to include your wake phrase file:

"AzureCognitiveServices": {
  "WakePhraseModel": "xxx.table",
  "WakeWord": "xxx"
}
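A misspelled file name in `WakePhraseModel` is easy to miss, so a small pre-flight check may save a confusing start-up failure. The sketch below is a hypothetical helper rather than code from this repository. It assumes the model file sits in the source root, as described in the steps above, and that the key names match the fragment shown.

```python
from pathlib import Path
from typing import Optional


def wake_model_path(config: dict, source_root: str = ".") -> Optional[Path]:
    """Return the configured wake phrase model path, or None if none is configured.

    Raises ValueError for a file that is not a .table model and
    FileNotFoundError if the file is missing from the source root.
    """
    name = config.get("AzureCognitiveServices", {}).get("WakePhraseModel")
    if not name:
        return None  # no custom model configured

    path = Path(source_root) / name
    if path.suffix != ".table":
        raise ValueError(f"WakePhraseModel should be a .table file, got {name!r}")
    if not path.is_file():
        raise FileNotFoundError(f"Wake phrase model not found: {path}")
    return path
```

Running this against the parsed `config.json` from the source root confirms that the file named in the configuration is actually present.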