pschatzmann / arduino-audio-tools

Arduino Audio Tools (a powerful Audio library not only for Arduino)
GNU General Public License v3.0
1.42k stars, 222 forks

Unable to run examples on ESP32 WROOM 32D with MAX98357A #1666

Closed RespawnDespair closed 3 days ago

RespawnDespair commented 4 weeks ago

Problem Description

After experiencing this issue with the Azure TTS example, I tried to run the most basic example, which I believe is streams-generator-i2s. Luckily, it shows the same issue.

I have the MAX98357A connected to different pins than usual because it is part of a bigger project. The device is functional and works with another library (ESP32-audioI2S, https://github.com/schreibfaul1/ESP32-audioI2S).

There I have defined the pins as follows:

#define I2S_DOUT  14
#define I2S_BCLK  26
#define I2S_LRC   27

Below I have attached the full sketch, where I tried to define the alternative pins as well.

When running the code it throws an error writing to the I2S stream as shown in this log extract:

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0030,len:1448
load:0x40078000,len:14844
ho 0 tail 12 room 4
load:0x40080400,len:4
load:0x40080404,len:3356
entry 0x4008059c
starting I2S...
[I] AudioTypes.h : 128 - out: sample_rate: 44100 / channels: 2 / bits_per_sample: 16
[I] AudioTypes.h : 128 -  sample_rate: 44100 / channels: 2 / bits_per_sample: 16
[I] I2SConfigESP32V1.h : 73 - rx/tx mode: TX_MODE
[I] I2SConfigESP32V1.h : 74 - port_no: 0
[I] I2SConfigESP32V1.h : 75 - is_master: Master
[I] I2SConfigESP32V1.h : 76 - sample rate: 44100
[I] I2SConfigESP32V1.h : 77 - bits per sample: 16
[I] I2SConfigESP32V1.h : 78 - number of channels: 2
[I] I2SConfigESP32V1.h : 79 - signal_type: Digital
[I] I2SConfigESP32V1.h : 81 - i2s_format: I2S_STD_FORMAT
[I] I2SConfigESP32V1.h : 84 - use_apll: true
[I] I2SConfigESP32V1.h : 92 - pin_bck: 26
[I] I2SConfigESP32V1.h : 94 - pin_ws: 27
[I] I2SConfigESP32V1.h : 96 - pin_data: 14
[I] I2SESP32V1.h : 214 - tx: 14, rx: -1
[I] SoundGenerator.h : 164 - SineWaveGenerator::begin(channels=2, sample_rate=44100, frequency=493.88)
[I] SoundGenerator.h : 149 - bool audio_tools::SineWaveGenerator<T>::begin() [with T = short int]
[I] AudioTypes.h : 128 - SoundGenerator: sample_rate: 44100 / channels: 2 / bits_per_sample: 16
[I] Buffers.h : 372 - resize: 4
[I] SoundGenerator.h : 192 - setFrequency: 493.88
[I] SoundGenerator.h : 193 - active: true
started...
[I] StreamCopy.h : 158 - StreamCopy::copy  1024 -> 1024 -> 1024 bytes - in 1 hops
[I] StreamCopy.h : 158 - StreamCopy::copy  1024 -> 1024 -> 1024 bytes - in 1 hops
[I] StreamCopy.h : 158 - StreamCopy::copy  1024 -> 1024 -> 1024 bytes - in 1 hops
[I] StreamCopy.h : 158 - StreamCopy::copy  1024 -> 1024 -> 1024 bytes - in 1 hops
[I] StreamCopy.h : 158 - StreamCopy::copy  1024 -> 1024 -> 1024 bytes - in 1 hops
...

I have read through the solved issues, but I could find no similar issue, which also leads me to believe I have made an error somewhere. I would appreciate any feedback or suggestions. I would love to use this library for my project as it feels the most clean and well coded to me.

Device Description

A generic ESP32 WROOM 32D device, set up in Arduino as follows:

ESP32 Dev Module
Partition Scheme: Huge App
PSRAM: Disabled

Sketch

/**
 * @file streams-generator-i2s.ino
 * @author Phil Schatzmann
 * @brief see https://github.com/pschatzmann/arduino-audio-tools/blob/main/examples/examples-stream/streams-generator-i2s/README.md 
 * @copyright GPLv3
 */

#include "AudioTools.h"

AudioInfo info(44100, 2, 16);
SineWaveGenerator<int16_t> sineWave(32000);                // subclass of SoundGenerator with max amplitude of 32000
GeneratedSoundStream<int16_t> sound(sineWave);             // Stream generated from sine wave
I2SStream out; 
StreamCopy copier(out, sound);                             // copies sound into i2s

// Arduino Setup
void setup(void) {  
  // Open Serial 
  Serial.begin(115200);
  while(!Serial);
  AudioLogger::instance().begin(Serial, AudioLogger::Info);

  // start I2S
  Serial.println("starting I2S...");
  auto config = out.defaultConfig(TX_MODE);
  config.copyFrom(info); 
  config.pin_ws = GPIO_NUM_27;            //LCK
  config.pin_bck = GPIO_NUM_26;           //BCK
  config.pin_data = GPIO_NUM_14;          //DIN
  out.begin(config);

  // Setup sine wave
  sineWave.begin(info, N_B4);
  Serial.println("started...");
}

// Arduino loop - copy sound to out 
void loop() {
  copier.copy();
}

Other Steps to Reproduce

No response

What is your development environment

Arduino IDE on MacOS

I have checked existing issues, discussions and online documentation

RespawnDespair commented 4 weeks ago

Well, that's embarrassing. The log actually shows no errors, and after reconnecting the ESP it does give output. It sounds a bit garbled, but sound is coming out.

I will go back to the Azure TTS sketch, which in fact did throw log errors...

RespawnDespair commented 4 weeks ago

Well, still no luck, I keep getting these errors when trying to use the Azure TTS example:

[I] StreamCopy.h : 158 - StreamCopy::copy  432 -> 432 -> 0 bytes - in 22 hops
[I] StreamCopy.h : 408 - try write  - 2 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 3 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 4 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 5 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 6 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 7 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 8 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 9 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 10 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 11 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 12 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 13 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 14 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 15 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 16 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 17 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 18 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 19 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 20 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 21 (open 432 bytes) 
[E] StreamCopy.h : 401 - write  to target has failed after 22 retries! (432 bytes)
[I] StreamCopy.h : 158 - StreamCopy::copy  432 -> 432 -> 0 bytes - in 22 hops
[I] StreamCopy.h : 408 - try write  - 2 (open 432 bytes) 
[I] StreamCopy.h : 408 - try write  - 3 (open 432 bytes) 

I have tried different output formats and codecs, since it seems the original riff-16khz format used in the example is no longer available (https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-text-to-speech?tabs=streaming#audio-outputs).

So I am sure it is still me making a stupid mistake, but any guidance would be highly appreciated...

pschatzmann commented 4 weeks ago

That should be pretty trivial to fix: Just take one of the RAW formats and remove the decoder!

If you write invalid data to the WAV decoder, this is bound to fail...

RespawnDespair commented 4 weeks ago

Yes, using a raw output format without a decoder indeed works. This is good news, it means the hardware is all good and the library works, thanks for this.

I still think the example is broken because the requested riff output format no longer appears on the list provided by Microsoft? If I use the WAV decoder and the output format specified in the example I get the errors mentioned in the previous log. I will try some additional combinations with the encoder, but I have a working reference now.

Again many thanks for the library and the help!

RespawnDespair commented 4 weeks ago

My quick tests did not result in a working OutputFormat and decoder combination. Changing the requested OutputFormat in the example to riff-8khz-16bit-mono-pcm and changing the sample_rate to 8000 also results in the errors above.

Since the riff- formats only appear on the non-streaming tab of the Microsoft page I get the feeling they are not available and it defaults to a different format?

If the example works for everyone else it might still be a thing on my end, but it does work in raw without a decoder, so I am still confused.

pschatzmann commented 4 weeks ago

If you have a WAV file, changing the AudioInfo does not have any impact, since the audio format is ultimately taken from the header. Did you try to get the file with curl on your desktop in order to analyze it?
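
To illustrate why the header wins, here is a minimal desktop C++ sketch that reads the format fields out of a RIFF/WAVE header held in memory. It is not the library's WAV decoder; the names `WavFormat` and `parseWavFormat` are made up for this example, and it only assumes the standard RIFF layout (a "fmt " chunk holding channels, sample rate and bits per sample in little-endian order).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper type for this example (not part of arduino-audio-tools).
struct WavFormat {
  uint16_t channels;
  uint32_t sample_rate;
  uint16_t bits_per_sample;
};

// WAV stores all numbers in little-endian byte order.
static uint16_t readLE16(const uint8_t* p) {
  return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

static uint32_t readLE32(const uint8_t* p) {
  return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Returns true and fills 'out' if 'data' starts with a RIFF/WAVE header that
// contains a "fmt " chunk. Returns false for raw PCM or any other data.
bool parseWavFormat(const uint8_t* data, size_t len, WavFormat& out) {
  if (len < 12 || std::memcmp(data, "RIFF", 4) != 0 ||
      std::memcmp(data + 8, "WAVE", 4) != 0) {
    return false;
  }
  size_t pos = 12;  // first chunk follows the 12-byte RIFF/WAVE preamble
  while (pos + 8 <= len) {
    const uint32_t size = readLE32(data + pos + 4);
    const size_t remaining = len - pos - 8;  // bytes after this chunk header
    if (std::memcmp(data + pos, "fmt ", 4) == 0) {
      if (size < 16 || remaining < 16) return false;
      const uint8_t* fmt = data + pos + 8;
      out.channels = readLE16(fmt + 2);
      out.sample_rate = readLE32(fmt + 4);
      out.bits_per_sample = readLE16(fmt + 14);
      return true;
    }
    if (size > remaining) return false;  // truncated or malformed chunk
    pos += 8 + size + (size & 1);        // chunks are padded to even length
  }
  return false;
}
```

If this returns false for the bytes Azure sends back, the response is not a WAV file, and a WAV decoder has no header to take the format from.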

Please share your working sketch, so that I can adjust the example accordingly.

RespawnDespair commented 4 weeks ago

I have altered the example a little since I use Dutch, but in essence all I did is remove the WAVE Decoder and modify the requested OutputFormat in the header to RAW as you suggested.

When using the original riff-16khz-16bit-mono-pcm, it does not work with the WAVE Decoder as set up in the example.

I have the file on my PC, but I am not experienced enough with WAVE files to know what I should look for.

....
AudioInfo info(16000, 1, 16);
...
I2SStream i2s;                          // or I2SStream 
StreamCopy copier(i2s, AzureURLStream); // copy in to out

void setup(){
  Serial.begin(115200);  
  AudioLogger::instance().begin(Serial, AudioLogger::Info);  

  // setup i2s
  auto config = i2s.defaultConfig(TX_MODE);
  config.copyFrom(info); 
  config.pin_ws = GPIO_NUM_27;            //LCK
  config.pin_bck = GPIO_NUM_26;           //BCK
  config.pin_data = GPIO_NUM_14;          //DIN
  i2s.begin(config);

  ....
  String ssml = "<speak version='1.0' xml:lang='" + language + "'><voice xml:lang='" + language + "' xml:gender='" + gender + "' name='" + voice + "'>" + msg + "</voice></speak>";
  AzureURLStream.addRequestHeader("Ocp-Apim-Subscription-Key", speechKey.c_str());
  AzureURLStream.addRequestHeader("X-Microsoft-OutputFormat", "raw-16khz-16bit-mono-pcm");   // if you change this, change the settings for i2s and the decoder
  ...
}

void loop(){
  copier.copy();
}

pschatzmann commented 4 weeks ago

Just look at the first 4 characters of the file: it should be RIFF
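
If a hex viewer is not at hand, that check can be done with a few lines of desktop C++. This is only a sketch: `readMagic` is a name invented for this example, and the file path is whatever you saved the curl download as.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Returns the first (up to) 4 bytes of the file as a string.
// An empty string means the file could not be opened or is empty.
std::string readMagic(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  char tag[4] = {0, 0, 0, 0};
  in.read(tag, sizeof(tag));
  return std::string(tag, static_cast<std::size_t>(in.gcount()));
}
```

Calling `readMagic` on the downloaded file and comparing the result with `"RIFF"` tells you whether the response is a WAV container. Anything else means the data is not a WAV file, for example raw PCM or a different encoding.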

pschatzmann commented 3 days ago

Closed due to inactivity