Text-To-Speech (TTS) support for Azure OpenAI

quintesse commented 6 days ago

Is your feature request related to a problem? Please describe.

The OpenAI "tts" and "tts-hd" models are now available in Azure, but the speech API is not supported by the "azure/openai" module (it offers speech-to-text, STT, but not TTS).

Describe the solution you'd like

Add support for the OpenAI speech API (see https://platform.openai.com/docs/api-reference/audio/createSpeech).

Describe alternatives you've considered

The only alternative I see is to write our own Node wrapper around the REST API.
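
For context, a hand-written wrapper of the kind mentioned above might start from something like the sketch below. It only builds the request for `fetch`; it does not send it. The helper name `buildSpeechRequest`, the URL shape (`/openai/deployments/{deployment}/audio/speech`), the `api-key` header, and the default `apiVersion` are assumptions that should be checked against the Azure OpenAI REST reference before use.

```typescript
// Hypothetical helper (not part of any SDK): assembles a raw REST request
// for the Azure OpenAI text-to-speech endpoint. URL shape and header name
// are assumptions to verify against the Azure REST documentation.
interface SpeechRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

function buildSpeechRequest(
  endpoint: string, // e.g. the value of AZURE_OPENAI_ENDPOINT
  deployment: string, // name of the TTS model deployment
  apiKey: string,
  input: string,
  voice = 'alloy',
  apiVersion = '2024-05-01-preview',
): SpeechRequest {
  // Drop trailing slashes so the path is not doubled up.
  const base = endpoint.replace(/\/+$/, '');
  const url =
    `${base}/openai/deployments/${encodeURIComponent(deployment)}` +
    `/audio/speech?api-version=${encodeURIComponent(apiVersion)}`;
  return {
    url,
    init: {
      method: 'POST',
      headers: { 'api-key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify({ model: deployment, input, voice }),
    },
  };
}

// Possible usage (requires a runtime with a global fetch, e.g. Node 18+):
//   const { url, init } = buildSpeechRequest(endpoint, 'tts-1', key, 'Hello');
//   const audio = Buffer.from(await (await fetch(url, init)).arrayBuffer());
```

Keeping the request construction separate from the network call makes the wrapper easy to unit-test without credentials.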

github-actions[bot] commented 6 days ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @glharper.

deyaaeldeen commented 5 days ago

Hi @quintesse, we recommend migrating to our newest Azure OpenAI client, AzureOpenAI, which is exported by openai@4.42.0. You can invoke the TTS method in Node.js as follows:

import 'openai/shims/node';
import { AzureOpenAI } from 'openai';
import { getBearerTokenProvider, DefaultAzureCredential } from '@azure/identity';
import fs from 'fs';
import path from 'path';

// Corresponds to your Model deployment within your OpenAI resource
// Navigate to the Azure OpenAI Studio to deploy a model.
const deployment = 'tts-1';
const apiVersion = '2024-05-01-preview';
const speechFile = path.resolve(__dirname, './speech.mp3');

const credential = new DefaultAzureCredential();
const scope = 'https://cognitiveservices.azure.com/.default';
const azureADTokenProvider = getBearerTokenProvider(credential, scope);

// Make sure to set the AZURE_OPENAI_ENDPOINT environment variable with the endpoint of your Azure resource. You can find it in the Azure Portal.
const openai = new AzureOpenAI({ azureADTokenProvider, deployment, apiVersion });

async function main() {

  const response = await openai.audio.speech.create({
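    // The request is routed to the deployment configured on the client; passing the deployment name here keeps the required `model` field meaningful.
    model: deployment,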
    voice: 'alloy',
    input: 'the quick brown chicken jumped over the lazy dogs',
  });

  const stream = response.body;
  console.log(`Streaming response to ${speechFile}`);
  await streamToFile(stream, speechFile);
  console.log('Finished streaming');
}

async function streamToFile(stream: NodeJS.ReadableStream, path: fs.PathLike) {
  return new Promise((resolve, reject) => {
    const writeStream = fs.createWriteStream(path).on('error', reject).on('finish', resolve);

    // If you don't see a `stream.pipe` method and you're using Node you might need to add `import 'openai/shims/node'` at the top of your entrypoint file.
    stream.pipe(writeStream).on('error', (error) => {
      writeStream.close();
      reject(error);
    });
  });
}

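// Surface failures (authentication, network, file system) instead of leaving an unhandled rejection.
main().catch((err) => {
  console.error('The sample encountered an error:', err);
  process.exit(1);
});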
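
As a possible alternative to the hand-rolled `streamToFile` helper above, the same step can be sketched with Node's built-in `pipeline` from `node:stream/promises`, which rejects on an error in either stream and destroys both. The name `streamToFileWithPipeline` and the returned byte count are illustrative choices, not part of the sample or of the openai package.

```typescript
import { createWriteStream } from 'node:fs';
import { pipeline } from 'node:stream/promises';

// Hypothetical variant of streamToFile: copies a readable source into a file
// and resolves with the number of bytes written once the file is flushed.
async function streamToFileWithPipeline(
  source: NodeJS.ReadableStream | AsyncIterable<Uint8Array>,
  filePath: string,
): Promise<number> {
  const writeStream = createWriteStream(filePath);
  // pipeline() wires source -> writeStream, rejects if either side fails,
  // and cleans up both streams, so no manual 'error' listeners are needed.
  await pipeline(source, writeStream);
  return writeStream.bytesWritten;
}

// Possible usage inside main(), assuming `response.body` is a Node stream:
//   const bytes = await streamToFileWithPipeline(response.body, speechFile);
//   console.log(`Wrote ${bytes} bytes to ${speechFile}`);
```

This trades the explicit `Promise` constructor for a single awaited call, which tends to be harder to get wrong when errors occur mid-stream.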

github-actions[bot] commented 5 days ago

Hi @quintesse. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text "/unresolve" to remove the "issue-addressed" label and continue the conversation.

quintesse commented 5 days ago

Thanks @deyaaeldeen! I didn't know the Azure API had been merged into the official openai module.

Azure / azure-sdk-for-js

Text-To-Speech (TTS) support for Azure OpenAI #30284