pipecat-ai / pipecat

Open Source framework for voice and multimodal conversational AI
BSD 2-Clause "Simplified" License

Intermittent error in STTService in _append_audio method #548

Open nitin-sharpsell opened 1 month ago

nitin-sharpsell commented 1 month ago

I get this error intermittently in a class that inherits from STTService.

2024-10-04 21:10:50.711 | ERROR    | pipecat.processors.frame_processor:push_frame:203 - Uncaught exception in DailyInputTransport#0: 'NoneType' object has no attribute 'write'
Traceback (most recent call last):

  File "<string>", line 1, in <module>
  File "/Users/user/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               │     │   └ 5
               │     └ 8
               └ <function _main at 0x105ef4820>
  File "/Users/user/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/spawn.py", line 129, in _main
    return self._bootstrap(parent_sentinel)
           │    │          └ 5
           │    └ <function BaseProcess._bootstrap at 0x105e8fa30>
           └ <SpawnProcess name='SpawnProcess-1' parent=25576 started>
  File "/Users/user/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
    │    └ <function BaseProcess.run at 0x105e8f0a0>
    └ <SpawnProcess name='SpawnProcess-1' parent=25576 started>
  File "/Users/user/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
    │    │        │    │        │    └ {'config': <uvicorn.config.Config object at 0x105ea9f90>, 'target': <bound method Server.run of <uvicorn.server.Server object...
    │    │        │    │        └ <SpawnProcess name='SpawnProcess-1' parent=25576 started>
    │    │        │    └ ()
    │    │        └ <SpawnProcess name='SpawnProcess-1' parent=25576 started>
    │    └ <function subprocess_started at 0x106233910>
    └ <SpawnProcess name='SpawnProcess-1' parent=25576 started>
  File "/Users/user/Documents/sharpsell/git/realtime_call/.venv/lib/python3.10/site-packages/uvicorn/_subprocess.py", line 80, in subprocess_started
    target(sockets=sockets)
    │              └ [<socket.socket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 8000)>]
    └ <bound method Server.run of <uvicorn.server.Server object at 0x105eab6a0>>
  File "/Users/user/Documents/sharpsell/git/realtime_call/.venv/lib/python3.10/site-packages/uvicorn/server.py", line 65, in run
    return asyncio.run(self.serve(sockets=sockets))
           │       │   │    │             └ [<socket.socket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 8000)>]
           │       │   │    └ <function Server.serve at 0x1061fa050>
           │       │   └ <uvicorn.server.Server object at 0x105eab6a0>
           │       └ <function run at 0x105fe5bd0>
           └ <module 'asyncio' from '/Users/user/.pyenv/versions/3.10.14/lib/python3.10/asyncio/__init__.py'>
  File "/Users/user/.pyenv/versions/3.10.14/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
           │    │                  └ <coroutine object Server.serve at 0x10639b450>
           │    └ <method 'run_until_complete' of 'uvloop.loop.Loop' objects>
           └ <uvloop.Loop running=True closed=False debug=False>
  File "/Users/user/Documents/sharpsell/git/realtime_call/.venv/lib/python3.10/site-packages/pipecat/transports/base_input.py", line 134, in _push_frame_task_handler
    await self.push_frame(frame, direction)
          │    │          │      └ <FrameDirection.DOWNSTREAM: 1>
          │    │          └ AudioRawFrame(id=6253, name='AudioRawFrame#5046', audio=b'P\xffX\xffW\xffU\xffX\xffF\xff?\xffO\xffp\xff\xa3\xff\xd2\xff\xf9\x...
          │    └ <function FrameProcessor.push_frame at 0x11d45c1f0>
          └ <pipecat.transports.services.daily.DailyInputTransport object at 0x14234dd20>
> File "/Users/user/Documents/sharpsell/git/realtime_call/.venv/lib/python3.10/site-packages/pipecat/processors/frame_processor.py", line 198, in push_frame
    await self._next.process_frame(frame, direction)
          │    │     │             │      └ <FrameDirection.DOWNSTREAM: 1>
          │    │     │             └ AudioRawFrame(id=6253, name='AudioRawFrame#5046', audio=b'P\xffX\xffW\xffU\xffX\xffF\xff?\xffO\xffp\xff\xa3\xff\xd2\xff\xf9\x...
          │    │     └ <function STTService.process_frame at 0x12220c550>
          │    └ <src.audio_call.stt.sharpsell_stt_service.SharpsellSTTService object at 0x122ca74f0>
          └ <pipecat.transports.services.daily.DailyInputTransport object at 0x14234dd20>
  File "/Users/user/Documents/sharpsell/git/realtime_call/.venv/lib/python3.10/site-packages/pipecat/services/ai_services.py", line 309, in process_frame
    await self._append_audio(frame)
          │    │             └ AudioRawFrame(id=6253, name='AudioRawFrame#5046', audio=b'P\xffX\xffW\xffU\xffX\xffF\xff?\xffO\xffp\xff\xa3\xff\xd2\xff\xf9\x...
          │    └ <function STTService._append_audio at 0x12220c3a0>
          └ <src.audio_call.stt.sharpsell_stt_service.SharpsellSTTService object at 0x122ca74f0>
  File "/Users/user/Documents/sharpsell/git/realtime_call/.venv/lib/python3.10/site-packages/pipecat/services/ai_services.py", line 276, in _append_audio
    self._wave.writeframes(frame.audio)
    │    │     │           │     └ b'P\xffX\xffW\xffU\xffX\xffF\xff?\xffO\xffp\xff\xa3\xff\xd2\xff\xf9\xff(\x00d\x00\x9b\x00\xd8\x00\xf1\x00\x11\x01#\x01\x1f\x0...
    │    │     │           └ AudioRawFrame(id=6253, name='AudioRawFrame#5046', audio=b'P\xffX\xffW\xffU\xffX\xffF\xff?\xffO\xffp\xff\xa3\xff\xd2\xff\xf9\x...
    │    │     └ <function Wave_write.writeframes at 0x12139cdc0>
    │    └ <wave.Wave_write object at 0x14268a4d0>
    └ <src.audio_call.stt.sharpsell_stt_service.SharpsellSTTService object at 0x122ca74f0>
  File "/Users/user/.pyenv/versions/3.10.14/lib/python3.10/wave.py", line 437, in writeframes
    self.writeframesraw(data)
    │    │              └ b'P\xffX\xffW\xffU\xffX\xffF\xff?\xffO\xffp\xff\xa3\xff\xd2\xff\xf9\xff(\x00d\x00\x9b\x00\xd8\x00\xf1\x00\x11\x01#\x01\x1f\x0...
    │    └ <function Wave_write.writeframesraw at 0x12139cd30>
    └ <wave.Wave_write object at 0x14268a4d0>
  File "/Users/user/.pyenv/versions/3.10.14/lib/python3.10/wave.py", line 432, in writeframesraw
    self._file.write(data)
    │    │           └ b'P\xffX\xffW\xffU\xffX\xffF\xff?\xffO\xffp\xff\xa3\xff\xd2\xff\xf9\xff(\x00d\x00\x9b\x00\xd8\x00\xf1\x00\x11\x01#\x01\x1f\x0...
    │    └ None
    └ <wave.Wave_write object at 0x14268a4d0>

AttributeError: 'NoneType' object has no attribute 'write'

In my custom STTService subclass, I save the audio bytes to AWS S3 and then call an API that accepts an mp3 file and returns the transcript text.

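import time
from typing import AsyncGenerator

from loguru import logger
from pipecat.frames.frames import ErrorFrame, Frame, TranscriptionFrame
from pipecat.utils.time import time_now_iso8601

# Method of the custom STTService subclass.
async def run_stt(self, audio: bytes) -> AsyncGenerator[Frame, None]:
    try:
        # Nothing to transcribe if the buffer is empty.
        if not audio:
            yield ErrorFrame("Audio is empty")
            return
        await self.start_ttfb_metrics()
        # Upload the audio to S3 under a timestamp-based key.
        file_path = f"real_time_call/{int(time.time())}.mp3"
        self.upload_audio_to_s3(audio_data=audio, file_path=file_path)
        audio_url = f"https:<s3_url>/{file_path}"
        # The transcript API is not streaming: one request per audio file.
        transcript = await self.call_transcript_api(audio_url=audio_url)
        await self.stop_ttfb_metrics()
        if transcript:
            logger.debug(f"Generated transcript from STT service: [{transcript}]")
            yield TranscriptionFrame(transcript, "", time_now_iso8601())
    except Exception as e:
        # Log the underlying error instead of discarding it.
        logger.exception(f"STT Service failed: {e}")
        yield ErrorFrame("STT Service failed")
        return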

The error is raised at this line: https://github.com/pipecat-ai/pipecat/blob/0c46b3e481b218264ba7ee17a75f7f165cb30d1a/src/pipecat/services/ai_services.py#L276

This call writes the audio data to a file object internally, but the file object appears to be None in this case. It happens intermittently and I have not been able to find the cause. Could someone please help?
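For reference, the same AttributeError can be reproduced with the standard library alone, which may help narrow things down. In CPython's wave module, Wave_write.close() sets the writer's internal file reference to None, so a later writeframes() call on that same writer reaches self._file.write(data) with no file. The sketch below only shows this stdlib behavior. It assumes, without confirmation from the traceback, that the writer in the service was closed before the failing frame arrived. The sample rate and frame size are arbitrary.

```python
import io
import wave

# Build an in-memory WAV writer (values here are arbitrary examples).
buffer = io.BytesIO()
writer = wave.open(buffer, "wb")
writer.setnchannels(1)
writer.setsampwidth(2)  # 16-bit samples
writer.setframerate(16000)

silence = b"\x00\x00" * 160  # 160 frames of 16-bit silence
writer.writeframes(silence)

# close() finalizes the header and drops the internal file reference.
writer.close()

# Writing to the closed writer fails inside the wave module.
try:
    writer.writeframes(silence)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

If this prints the same message as the traceback above, the thing to look for is any code path that closes the writer and does not replace it before the next audio frame is processed.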

sslankesh commented 1 month ago

I am also facing the same issue.

The API I am using for speech to text conversion does not support streaming response.
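With a non-streaming API, each buffered segment becomes one complete request, so the audio buffer has to be finalized and handed off as a whole. The sketch below shows one defensive way to do that. It rests on an assumption the thread does not confirm: that the error comes from a WAV writer that was closed for transcription and not yet replaced when the next frame arrived. WavBuffer and its methods are hypothetical names for illustration and are not part of pipecat.

```python
import io
import wave


class WavBuffer:
    """Hypothetical helper: collects PCM frames and hands out complete WAV payloads."""

    def __init__(self, sample_rate: int = 16000, num_channels: int = 1):
        self._sample_rate = sample_rate
        self._num_channels = num_channels
        self._content, self._wave = self._new_wave()

    def _new_wave(self):
        # Create a fresh in-memory WAV writer with the configured format.
        content = io.BytesIO()
        writer = wave.open(content, "wb")
        writer.setsampwidth(2)  # 16-bit samples
        writer.setnchannels(self._num_channels)
        writer.setframerate(self._sample_rate)
        return content, writer

    def append(self, pcm: bytes) -> None:
        # Always writes to an open writer, because flush() swaps before closing.
        self._wave.writeframes(pcm)

    def flush(self) -> bytes:
        """Return the WAV bytes gathered so far and start a new segment."""
        old_content, old_wave = self._content, self._wave
        # Install the replacement first so append() never sees a closed writer,
        # even if the transcription request that follows is slow or fails.
        self._content, self._wave = self._new_wave()
        old_wave.close()
        return old_content.getvalue()


buf = WavBuffer()
buf.append(b"\x00\x00" * 160)
payload = buf.flush()          # bytes to send to the transcription API
buf.append(b"\x00\x00" * 160)  # still safe: a new writer is already in place
```

The design choice is simply ordering: the new writer is created before the old one is closed, so no await or exception between the two steps can leave the buffer unusable.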