festvox / flite

A small fast portable speech synthesis system

16khz output from indic voice #42

Open MittalShruti opened 4 years ago

MittalShruti commented 4 years ago

Hello, I am using an Indic voice to generate audio:

./flite/bin/flite "-voice" flite/voices/cmu_indic_hin_ab.flitevox 'पुत्र मित्र आदि सगे संबंधियों' "-o" 'try.wav'

The output file try.wav is always 16 kHz. However, README.md says that the output is deliberately kept at 8 kHz. Does that not apply to non-US voices?

mawillcockson commented 4 years ago

That is strange.

It appears that the included slt voice also produces a 16 kHz WAV file:

[Image: Audacity spectrogram]

rsingh2083 commented 3 years ago

Hi Shruti,

Could you please share the cmu_indic_hin_ab.flitevox file?

mawillcockson commented 3 years ago

I'm not the original issue reporter, but I believe they used the voices provided on the associated website, specifically:

http://www.festvox.org/flite/packed/flite-2.1/voices/cmu_indic_hin_ab.flitevox

I was able to reproduce the issue:

flite -voice "http://www.festvox.org/flite/packed/flite-2.1/voices/cmu_indic_hin_ab.flitevox" 'पुत्र मित्र आदि सगे संबंधियों' -o try.wav

The resulting file appears to have a 16 kHz sample rate:

aplay --samples=1 --verbose try.wav

which gives

Playing WAVE 'try.wav' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
ALSA <-> PulseAudio PCM I/O Plugin
Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 1
  rate         : 16000
  exact rate   : 16000 (16000/1)
  msbits       : 16
  buffer_size  : 8000
  period_size  : 2000
  period_time  : 125000
  tstamp_mode  : NONE
  tstamp_type  : GETTIMEOFDAY
  period_step  : 1
  avail_min    : 2000
  period_event : 0
  start_threshold  : 8000
  stop_threshold   : 8000
  silence_threshold: 0
  silence_size : 0
  boundary     : 9007199254740992000
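
For anyone without ALSA tools at hand, the sample rate can also be read straight from the WAV header. The following is only a minimal sketch using Python's standard wave module; the function name wav_sample_rate is made up for illustration, and it assumes the file is plain PCM, as the aplay output above suggests (S16_LE).

```python
import sys
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a PCM WAV file's header.

    Hypothetical helper for illustration; not part of flite.
    """
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


if __name__ == "__main__":
    # Print the header sample rate of every file named on the command line.
    for path in sys.argv[1:]:
        print(f"{path}: {wav_sample_rate(path)} Hz")
```

Saved under a name of your choice and run with try.wav as its argument, this should report the same rate that aplay shows in its "Rate" field.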