Jiangshan00001 / pyttsx4

Offline Text To Speech synthesis for python
Mozilla Public License 2.0
43 stars 8 forks source link

QA: save to memory error #2

Closed Gary2018X closed 1 year ago

Gary2018X commented 1 year ago

After using the save to bytes method you developed, I wanted to use pydub loading for some post-processing, but it reported an error in format. How can I solve it?

import pyttsx4
from io import BytesIO
from pydub import AudioSegment
from pydub.playback import play

engine = pyttsx4.init()
b = BytesIO()
engine.save_to_file('i am Hello World', b)
engine.runAndWait()

b.seek(0)
audio = AudioSegment.from_file(b, format="wav")
play(audio)
Gary2018X commented 1 year ago
Traceback (most recent call last):
  File "test.py", line 15, in <module>
    audio = AudioSegment.from_file(b)
  File "C:\Users\cjl\AppData\Local\Programs\Python\Python38\lib\site-packages\pydub\audio_segment.py", line 773, in from_file
    raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:

ffmpeg version N-108625-g28ac2279ad-20221012 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 12.1.0 (crosstool-NG 1.25.0.55_3defb7b)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 
--disable-debug --enable-shared --disable-static --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libkvazaar --enable-libass --enable-libbluray --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist --enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libmfx --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 
--enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20221012
  libavutil      57. 39.101 / 57. 39.101
  libavcodec     59. 50.100 / 59. 50.100
  libavformat    59. 34.101 / 59. 34.101
  libavdevice    59.  8.101 / 59.  8.101
  libavfilter     8. 49.101 /  8. 49.101
  libswscale      6.  8.112 /  6.  8.112
  libswresample   4.  9.100 /  4.  9.100
  libpostproc    56.  7.100 / 56.  7.100
[cache @ 000001d7a0916f40] Statistics, cache hits:0 cache misses:0
cache:pipe:0: Invalid data found when processing input
Jiangshan00001 commented 1 year ago

b.seek(0) #<<---add this line and have another try audio = AudioSegment.from_file(b, format="wav")

Gary2018X commented 1 year ago

still reporting the same error

Traceback (most recent call last):
  File "test_tttts.py", line 12, in <module>
    audio = AudioSegment.from_file(b, format="wav")
  File "C:\Users\cjl\AppData\Local\Programs\Python\Python38\lib\site-packages\pydub\audio_segment.py", line 773, in from_file
    raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:

ffmpeg version N-108625-g28ac2279ad-20221012 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 12.1.0 (crosstool-NG 1.25.0.55_3defb7b)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 
--disable-debug --enable-shared --disable-static --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libkvazaar --enable-libass --enable-libbluray --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist --enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libmfx --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --disable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 
--enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20221012
  libavutil      57. 39.101 / 57. 39.101
  libavcodec     59. 50.100 / 59. 50.100
  libavformat    59. 34.101 / 59. 34.101
  libavdevice    59.  8.101 / 59.  8.101
  libavfilter     8. 49.101 /  8. 49.101
  libswscale      6.  8.112 /  6.  8.112
  libswresample   4.  9.100 /  4.  9.100
  libpostproc    56.  7.100 / 56.  7.100
[wav @ 00000190eeee6bc0] invalid start code [0][0][0][0] in RIFF header
[cache @ 00000190eeee70c0] Statistics, cache hits:0 cache misses:1
cache:pipe:0: Invalid data found when processing input
Jiangshan00001 commented 1 year ago

the problem is that the memory stream has no wav header. i have not think it clear that we need a memory with header or not. so , for the current solution, we can add the wav header for your problem like that below.

@Gary2018X

import pyttsx4
from io import BytesIO
from pydub import AudioSegment
from pydub.playback import play
import os
import sys

engine = pyttsx4.init()
b = BytesIO()
engine.save_to_file('i am Hello World', b)
engine.runAndWait()
#the bs is raw data of the audio.
bs=b.getvalue()
# add an wav file format header
b=bytes(b'RIFF')+ (len(bs)+38).to_bytes(4, byteorder='little')+b'WAVEfmt\x20\x12\x00\x00' \
                                                               b'\x00\x01\x00\x01\x00' \
                                                               b'\x22\x56\x00\x00\x44\xac\x00\x00' +\
    b'\x02\x00\x10\x00\x00\x00data' +(len(bs)).to_bytes(4, byteorder='little')+bs
# changed to BytesIO
b=BytesIO(b)
audio = AudioSegment.from_file(b, format="wav")
play(audio)

sys.exit(0)
egorgam commented 1 year ago

looks like this feature works only with sapi5 engine (on Windows machines)

Jiangshan00001 commented 1 year ago

@egorgam yes you are right. it currently support only sapi5 engine. which engine do you want to add this feature?

Gary2018X commented 1 year ago

Thanks.

Jiangshan00001 commented 1 year ago

@egorgam make a new feature request issue to describe you problem if you want another engine support.

Jiangshan00001 commented 1 year ago

@egorgam espeak engine is supposed to be supported by the current version(3.0.5). but it is not tested. you can try if you like