pschatzmann / arduino-espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
GNU General Public License v3.0

Skipping and Jumping Speech #9

Closed by eroboticdude 11 months ago

eroboticdude commented 11 months ago

Hello, I am experimenting with this library. Using an I2S DAC and the default test example on an ESP32-S2, the speech skips and jumps around and is also noticeably sped up. Here is my code:

#include "AudioTools.h" // https://github.com/pschatzmann/arduino-audio-tools
//#include "AudioLibs/AudioKit.h" // https://github.com/pschatzmann/arduino-audiokit
#include "FileSystems.h" // https://github.com/pschatzmann/arduino-posix-fs
#include "espeak.h"

I2SStream i2s; // or replace with AudioKitStream for AudioKit
ESpeak espeak(i2s);

void setup() {
  Serial.begin(115200);
  //file_systems::FSLogger.begin(file_systems::FSInfo, Serial); 
  // add voice option
  heap_caps_malloc_extmem_enable(1000000);
  espeak.add("/mem/data/voices/!v/announcer", espeak_ng_data_voices__v_announcer, espeak_ng_data_voices__v_announcer_len);

  // setup espeak
  espeak.begin();
  // Set voice and voice option
  espeak.setVoice("en+announcer");

  // setup output
  audio_info espeak_info = espeak_get_audio_info();
  auto cfg = i2s.defaultConfig(TX_MODE);
  cfg.channels = 2; //espeak_info.channels; // 1
  cfg.sample_rate = 14000;//espeak_info.sample_rate; // 22050
  cfg.bits_per_sample = espeak_info.bits_per_sample; // 16
  cfg.pin_bck = 18;
  cfg.pin_ws = 33;
  cfg.pin_data = 16;
  i2s.begin(cfg);

}
String line;

void loop() {
  if (Serial.available() > 0) {
    line = Serial.readStringUntil('\n');
    Serial.println(line);
    espeak.say(line.c_str());
  }
}

eroboticdude commented 11 months ago

Note that I tested it with the voice set to alicia (the default).

eroboticdude commented 11 months ago

How do I properly change the voice? The documentation is very hard to grasp. I assumed I should just change the text but this does not seem to work.

pschatzmann commented 11 months ago

I can't really add anything on top of what's already in the wiki. It's a two-step process: first you need to make the configuration files available (with add) before you can set the voice.

eroboticdude commented 11 months ago

Thanks, I've been figuring this out. It turns out that setting the output to 2 channels causes the sped-up behavior. As for voices, the announcer voice, which I would like to try, is broken: it seems it can only pronounce "k" and "t" sounds. Otherwise, everything is working better after switching to a single channel.

pschatzmann commented 11 months ago

Yes, I can confirm that espeak is generating a mono signal.
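
To illustrate why a mono signal sounds sped up on a 2-channel output, here is a minimal standalone C++ sketch (plain C++, not Arduino code and not part of this library). The helper names `playbackSeconds` and `monoToStereo` are hypothetical and only serve the example; the 22050 Hz rate is taken from the comment in the sketch above. The idea: an I2S output configured for 2 channels consumes two samples per frame, so a mono buffer is used up twice as fast unless each sample is duplicated for left and right.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Seconds of audio played for a buffer of samples, given the output config.
// One frame holds one sample per channel, and the output consumes
// sampleRate frames per second.
double playbackSeconds(std::size_t sampleCount, int sampleRate, int channels) {
    return static_cast<double>(sampleCount) / channels / sampleRate;
}

// Duplicate every mono sample into a left and a right slot so the buffer
// matches a 2-channel output configuration.
std::vector<int16_t> monoToStereo(const std::vector<int16_t>& mono) {
    std::vector<int16_t> stereo;
    stereo.reserve(mono.size() * 2);
    for (int16_t sample : mono) {
        stereo.push_back(sample);  // left
        stereo.push_back(sample);  // right
    }
    return stereo;
}
```

With these helpers, 22050 mono samples last one second when `channels` is 1 but only half a second when the same buffer is fed to a 2-channel output, which matches the sped-up speech described above. Either configuring the output with the values reported by `espeak_get_audio_info()` or duplicating the samples as in `monoToStereo` restores the intended duration.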

I haven't had a close look at the voices; it is probably easiest to try them out on the desktop to figure out which combinations work.

eroboticdude commented 11 months ago

The issues were most likely caused by the pre-release version of the ESP32-S2 drivers (3.0.0-2).