
react-native-voicebox-speech-rec

React Native Voicebox Speech Recognition: A powerful speech recognition library for React Native applications, enabling real-time speech-to-text transcription.

Versions

1.0.0: iOS support, Android support, Sample App


Features

Installation

npm install react-native-voicebox-speech-rec --save

or

yarn add react-native-voicebox-speech-rec


Then install the iOS native dependencies by running:

cd ios && pod install && cd ..

or

npx pod-install


Please make sure AndroidX is enabled in your React Native project by editing android/gradle.properties and adding the following two lines if they are not already present:

android.useAndroidX=true
android.enableJetifier=true


Linking

Please use React Native version >= 0.60. React Native will automatically link the library to your project.

Development Notes

📱 iOS Simulator Limitation: the iOS Simulator does not use actual microphone input; it simulates voice input with repeated "Test" phrases. For genuine speech recognition testing, use a physical iPhone.

🤖 Android Emulator: no specific issues noted. Just remember to enable host audio input for the emulator every time you start it (shown below).

[Screenshot: enabling host audio input in the Android Emulator settings]


Usage

Just two steps. First, wrap your app's root view with SpeechRecognitionRootView:

import React from 'react';
import { PaperProvider } from 'react-native-paper';
import { ConversationPage } from './conversation/ConversationPage';
import { SpeechRecognitionRootView } from 'react-native-voicebox-speech-rec';
import { SafeAreaProvider } from 'react-native-safe-area-context';

function App(): JSX.Element {
  return (
    <SafeAreaProvider>
      <PaperProvider>
        <SpeechRecognitionRootView>
          <ConversationPage />
        </SpeechRecognitionRootView>
      </PaperProvider>
    </SafeAreaProvider>
  );
}
export default App;


Then, call the useSpeechRecognition hook from any component rendered inside SpeechRecognitionRootView:

import React, { useCallback, useEffect, useMemo } from 'react';
import { ScrollView, Alert } from 'react-native';
import { Text } from 'react-native-paper';
import { SafeAreaView } from 'react-native-safe-area-context';
import { MicrophoneButton } from './MicrophoneButton';
import { useSpeechRecognition } from 'react-native-voicebox-speech-rec';

export const ConversationPage = React.memo(() => {
  /** ***************************************************************
   * Speech Recognition APIs from useSpeechRecognition hook
   *************************************************************** */
  const {
    startSpeechRecognition,
    stopSpeechRecognition,
    speechContentRealTime,
    setSpeechRecErrorHandler,
    setSpeechRecStartedHandler,
    setSpeechRecCompletedHandler,
  } = useSpeechRecognition();

  // Event handler when speech recognition starts
  useEffect(() => {
    setSpeechRecStartedHandler(() => {
      console.log('👆 Speech Recognition Started!');
    });
  }, [setSpeechRecStartedHandler]);

  // Event handler when speech recognition has errors
  useEffect(() => {
    setSpeechRecErrorHandler((errorMessage: string) => {
      Alert.alert(
        'Error in speech recognition',
        String(errorMessage),
        [
          {
            text: 'OK',
            style: 'cancel',
          },
        ],
        { cancelable: false }
      );
    });
  }, [setSpeechRecErrorHandler]);

  // Event handler when speech recognition ends
  useEffect(() => {
    setSpeechRecCompletedHandler(async (userChatMessage: string) => {
      if (userChatMessage.length > 0) {
        console.log('🎉 Speech Recognition Completed. Recognized Content: ', userChatMessage);
      } else {
        console.log('🎉 Speech Recognition Completed. User spoke nothing.');
      }
    });
  }, [setSpeechRecCompletedHandler]);

  /** **********************************************************************
   * Start speech recognition when user presses the microphone button
   ********************************************************************** */
  const handleConversationButtonPressed = useCallback(async () => {
    startSpeechRecognition();
  }, [startSpeechRecognition]);

  /** **********************************************************************
   * End speech recognition when user releases the microphone button
   ********************************************************************** */
  const handleConversationButtonReleased = useCallback(() => {
    stopSpeechRecognition();
  }, [stopSpeechRecognition]);

  const speechRecContentArea = useMemo(() => {
    return <Text variant="titleLarge">{speechContentRealTime}</Text>;
  }, [speechContentRealTime]);

  return (
    <SafeAreaView>
      {/* Show realtime speech recognition content in this area */}
      <ScrollView>{speechRecContentArea}</ScrollView>

      <MicrophoneButton
        disabled={false}
        handleButtonPressed={handleConversationButtonPressed}
        handleButtonReleased={handleConversationButtonReleased}
      />
    </SafeAreaView>
  );
});
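
Note that MicrophoneButton comes from the example app, not from this library. If you need a starting point, a minimal press-and-hold sketch could look like the one below; the component name, props, and styling are only assumptions matching the usage above:

import React from 'react';
import { Pressable, StyleSheet, Text } from 'react-native';

// Hypothetical press-and-hold button matching the props used in ConversationPage above.
interface MicrophoneButtonProps {
  disabled: boolean;
  handleButtonPressed: () => void;
  handleButtonReleased: () => void;
}

export const MicrophoneButton = React.memo(
  ({ disabled, handleButtonPressed, handleButtonReleased }: MicrophoneButtonProps) => (
    // Start listening while the button is held down and stop when it is released.
    <Pressable
      disabled={disabled}
      onPressIn={handleButtonPressed}
      onPressOut={handleButtonReleased}
      style={styles.button}>
      <Text style={styles.label}>Hold to talk</Text>
    </Pressable>
  )
);

const styles = StyleSheet.create({
  button: {
    alignSelf: 'center',
    padding: 16,
    borderRadius: 32,
    backgroundColor: '#6200ee',
  },
  label: {
    color: 'white',
  },
});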


Example App

For full and advanced usage, please take a look at the example React Native app in the example folder.

API

Here are all the APIs provided by the useSpeechRecognition hook.

API | Description | Platform
startSpeechRecognition(localeTag) | Starts listening for speech in a specific locale, given as an IETF BCP 47 tag (e.g. en-US, en-GB, zh-CN, etc.). If no localeTag is given, the system locale of the user's device is used. | iOS, Android
stopSpeechRecognition | Stops listening for speech. | iOS, Android
cancelSpeechRecognition | Cancels listening for speech. | iOS, Android
speechContentRealTime | A string containing the text recognized so far while the user is speaking. Use it to show real-time speech recognition results. | iOS, Android
setSpeechRecStartedHandler | Sets a callback which is called when speech recognition starts without error. Handler signature: async () => void | iOS, Android
setSpeechRecCompletedHandler | Sets a callback which is called when speech recognition completes without error. Handler signature: async (speechRecResult: string) => void, where speechRecResult is the text recognized from the user's speech. | iOS, Android
setSpeechRecErrorHandler | Sets a callback which is called when speech recognition fails with an error. Handler signature: async (errorMessage: string) => void, where errorMessage explains why the error happened. | iOS, Android
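
For a quick illustration of startSpeechRecognition with an explicit locale tag and of cancelSpeechRecognition (neither is shown in the Usage example above), a component rendered inside SpeechRecognitionRootView might be wired up roughly like this; the component itself and its buttons are hypothetical, only the hook APIs come from the library:

import React, { useCallback } from 'react';
import { Button, View } from 'react-native';
import { useSpeechRecognition } from 'react-native-voicebox-speech-rec';

export const LocaleAwareRecognizer = React.memo(() => {
  const {
    startSpeechRecognition,
    stopSpeechRecognition,
    cancelSpeechRecognition,
  } = useSpeechRecognition();

  // Listen using an explicit IETF BCP 47 locale tag instead of the device locale.
  const handleStart = useCallback(() => {
    startSpeechRecognition('en-GB');
  }, [startSpeechRecognition]);

  // Stop listening; the completed handler (if set) receives the recognized text.
  const handleStop = useCallback(() => {
    stopSpeechRecognition();
  }, [stopSpeechRecognition]);

  // Abort listening and discard the current recognition session.
  const handleCancel = useCallback(() => {
    cancelSpeechRecognition();
  }, [cancelSpeechRecognition]);

  return (
    <View>
      <Button title="Start (en-GB)" onPress={handleStart} />
      <Button title="Stop" onPress={handleStop} />
      <Button title="Cancel" onPress={handleCancel} />
    </View>
  );
});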

License

MIT


Packages used

Many thanks to the following libraries that this project depends on:


Enjoy using this library!