csdcorp / speech_to_text

A Flutter plugin that exposes device-specific speech-to-text recognition capability.
BSD 3-Clause "New" or "Revised" License

[feat] handle audio session when start listening #487

Closed wamynobe closed 1 month ago

wamynobe commented 2 months ago

I am facing an issue when using this library together with packages like flutter_video and webview. When I try to play a video with audio and use speech to text at the same time, speech_to_text's audio session is prioritized and the video gets paused. I tried adding .mixWithOthers and it worked as I expected. The same applies on the Android side. I will submit a PR for this; please check it out. Thank you!

wamynobe commented 2 months ago

Related to #472.

wamynobe commented 2 months ago
import 'dart:async';

import 'package:flutter/material.dart';
import 'package:speech_to_text/speech_recognition_result.dart';
import 'package:speech_to_text/speech_to_text.dart';
import 'package:youtube_player_flutter/youtube_player_flutter.dart';

void main() => runApp(const SpeechSampleApp());

class SpeechSampleApp extends StatefulWidget {
  const SpeechSampleApp({Key? key}) : super(key: key);

  @override
  State<SpeechSampleApp> createState() => _SpeechSampleAppState();
}

class _SpeechSampleAppState extends State<SpeechSampleApp> {
  final YoutubePlayerController _controller = YoutubePlayerController(
    initialVideoId: 'iLnmTe5Q2Qw',
  );
  final SpeechToText speech = SpeechToText();
  String text = '';
  @override
  void initState() {
    super.initState();
    initSpeechState();
  }

  Future<void> initSpeechState() async {
    try {
      await speech.initialize();
    } catch (_) {
      // Initialization errors are ignored in this minimal repro.
    }
  }

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(
          title: const Text('Speech to Text Example'),
        ),
        body: Column(children: [
          const HeaderWidget(),
          Column(
            children: <Widget>[
              YoutubePlayer(
                controller: _controller,
                showVideoProgressIndicator: true,
                progressIndicatorColor: Colors.amber,
                progressColors: const ProgressBarColors(
                  playedColor: Colors.amber,
                  handleColor: Colors.amberAccent,
                ),
                onReady: () {},
              ),
            ],
          ),
          const SizedBox(height: 20),
          ElevatedButton(
              onPressed: () {
                startListening();
              },
              child: const Text('Start')),
          ElevatedButton(
              onPressed: () {
                stopListening();
              },
              child: const Text('Stop')),
          Text('Recognized Words: $text'),
          SpeechStatusWidget(speech: speech),
        ]),
      ),
    );
  }

  void startListening() {
    final options = SpeechListenOptions(
        listenMode: ListenMode.confirmation,
        cancelOnError: true,
        partialResults: true,
        autoPunctuation: true,
        enableHapticFeedback: true);

    speech.listen(
      onResult: resultListener,
      listenOptions: options,
    );
    setState(() {});
  }

  void stopListening() {
    speech.stop();
  }

  /// This callback is invoked each time new recognition results are
  /// available after `listen` is called.
  void resultListener(SpeechRecognitionResult result) {
    setState(() {
      text = result.recognizedWords;
    });
  }
}

/// Displays the most recently recognized words and the sound level.
class RecognitionResultsWidget extends StatelessWidget {
  const RecognitionResultsWidget({
    Key? key,
    required this.lastWords,
    required this.level,
  }) : super(key: key);

  final String lastWords;
  final double level;

  @override
  Widget build(BuildContext context) {
    return Column(
      children: <Widget>[
        const Center(
          child: Text(
            'Recognized Words',
            style: TextStyle(fontSize: 22.0),
          ),
        ),
        Expanded(
          child: Stack(
            children: <Widget>[
              Container(
                color: Theme.of(context).secondaryHeaderColor,
                child: Center(
                  child: Text(
                    lastWords,
                    textAlign: TextAlign.center,
                  ),
                ),
              ),
              Positioned.fill(
                bottom: 10,
                child: Align(
                  alignment: Alignment.bottomCenter,
                  child: Container(
                    width: 40,
                    height: 40,
                    alignment: Alignment.center,
                    decoration: BoxDecoration(
                      boxShadow: [
                        BoxShadow(
                            blurRadius: .26,
                            spreadRadius: level * 1.5,
                            color: Colors.black.withOpacity(.05))
                      ],
                      color: Colors.white,
                      borderRadius: const BorderRadius.all(Radius.circular(50)),
                    ),
                    child: IconButton(
                      icon: const Icon(Icons.mic),
                      onPressed: () {},
                    ),
                  ),
                ),
              ),
            ],
          ),
        ),
      ],
    );
  }
}

class HeaderWidget extends StatelessWidget {
  const HeaderWidget({
    Key? key,
  }) : super(key: key);

  @override
  Widget build(BuildContext context) {
    return const Center(
      child: Text(
        'Speech recognition available',
        style: TextStyle(fontSize: 22.0),
      ),
    );
  }
}

/// Display the current status of the listener
class SpeechStatusWidget extends StatelessWidget {
  const SpeechStatusWidget({
    Key? key,
    required this.speech,
  }) : super(key: key);

  final SpeechToText speech;

  @override
  Widget build(BuildContext context) {
    return Container(
      padding: const EdgeInsets.symmetric(vertical: 20),
      color: Theme.of(context).colorScheme.background,
      child: Center(
        child: speech.isListening
            ? const Text(
                "I'm listening...",
                style: TextStyle(fontWeight: FontWeight.bold),
              )
            : const Text(
                'Not listening',
                style: TextStyle(fontWeight: FontWeight.bold),
              ),
      ),
    );
  }
}

You can try this example as well

wamynobe commented 2 months ago

Should we keep this issue open, or create a new one to start a discussion about the Android side?

wamynobe commented 2 months ago

I found this in the implementation of WebView on Android:

@Override
public void onAudioFocusChange(int focusChange) {
    switch (focusChange) {
        case AudioManager.AUDIOFOCUS_GAIN:
            // resume playback
            if (mMediaPlayer == null) {
                resetMediaPlayer();
            } else if (mState != ERROR && !mMediaPlayer.isPlaying()) {
                mMediaPlayer.start();
                mState = STARTED;
            }
            break;
        case AudioManager.AUDIOFOCUS_LOSS:
            // Lost focus for an unbounded amount of time: stop playback.
            if (mState != ERROR && mMediaPlayer.isPlaying()) {
                mMediaPlayer.stop();
                mState = STOPPED;
            }
            break;
        case AudioManager.AUDIOFOCUS_LOSS_TRANSIENT:
        case AudioManager.AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK:
            // Lost focus for a short time, but we have to stop
            // playback.
            if (mState != ERROR && mMediaPlayer.isPlaying()) pause();
            break;
    }
}

Looks like it stops the player when losing audio focus, even with the duck option.

Edit: This is an old ref in their Git history.

wamynobe commented 2 months ago

I couldn't find anything related to this file in the latest ref. OMG!!!!

sowens-csd commented 2 months ago

I think we should keep this open for discussion on the Android options, the context is useful.

wamynobe commented 1 month ago

Due to limitations of webkit on Android, audio cannot currently be "mixed" when playing video in a webview and using speech to text at the same time. If you are using video_player, you can add the option VideoPlayerOptions.mixWithOthers to make speech to text and your video player play at the same time. I'll close this issue; further information can be found here link
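For reference, a minimal sketch of the video_player workaround described above (the video URL is a placeholder, and the exact constructor name may vary between video_player versions): with mixWithOthers set to true, the player does not claim exclusive use of the audio session, so speech_to_text can listen while the video keeps playing.

import 'package:video_player/video_player.dart';

final controller = VideoPlayerController.networkUrl(
  // Placeholder URL for illustration only.
  Uri.parse('https://example.com/video.mp4'),
  // Allows this player's audio to mix with other audio sessions,
  // such as the one opened by speech_to_text while listening.
  videoPlayerOptions: VideoPlayerOptions(mixWithOthers: true),
);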