earlephilhower / ESP8266SAM

Speech synthesis for ESP8266 using S.A.M. port

foreground buzzing noise #2

Closed sfranzyshen closed 6 years ago

sfranzyshen commented 6 years ago

I am using the esp8266 (12) with NoDAC and get a foreground noise that renders the output (almost) unusable ... If I didn't know what it was saying ... I wouldn't ... I noticed that the piano sample in the esp8266audio library sounded the same way until I normalized, and amplified the mp3 ... any chance of making tweaks to the esp8266sam's output? I plan to test it with an i2c hardware decoder and amp to see if it's just the quality or the NoDAC setup ...

earlephilhower commented 6 years ago

I think you have some HW issues in the 1-T amplifier stage. If the standard ESP8266Audio samples don't sound about FM radio quality, then SAM's not going to be better. It's a speech synthesizer from 1979 for 8-bit cores running at 900 kHz, so it's not going to be confused with anything modern. :)

Make sure you've got a clean 5V supply (large and small decoupling caps help), NPN is oriented right (I've done it backwards more than once) and the speaker is 4-8ohm (I've only run w/8ohm), not 32ohm like some smaller "buzzer-type" I've seen.

MochiLata commented 6 years ago

Hi, just wanted to chime in that I have the same problem. I hear a foreground noise that somewhat sounds like beeping. I can hear the voice synthesis, along with the beeping-ish noise. I've tried both a tiny Piezo Speaker, and also a $10 computer speaker, and both have problems. Strangely, using ESP8266Audio's example, I can play the violin Wav fine onto the speaker, with no distortion. I've tried both directly driving the speaker from the I2S port, and also using a simple transistor amplifier with the 5V -> (+)Speaker(-) -> Transistor Collector. Would a low pass filter help in this?

Martin-Laclaustra commented 6 years ago

@earlephilhower Thanks for your efforts (I was pursuing the same goal when you published your success). This is not my recording, but the buzzing noise can be heard here: https://www.youtube.com/watch?v=ya2n16q5gfU

I confirm that the original project, compiled for AMD64 CPUs, does produce clearer sound, and that the violin sample plays without any distortion or extraneous noise on the ESP8266.

Pending: writing a synthesized sample to flash memory and then:

1. recovering it and playing it on the PC;
2. playing it from memory on the ESP8266;
3. comparing it to the direct synthesized output.

This simple test will show whether the noise appears in the synthesis process or in the output phase (I suspect the latter; maybe it is being output as I2S sound despite being configured for NoDAC). I cannot commit to a deadline for these tests. Maybe they are trivial enough for you to check in a breeze.

earlephilhower commented 6 years ago

@Martin-Laclaustra, that's actually how I debugged the first versions! There's a SerialOutputWAV class (forget exact name, not handy now) which writes a WAV file to the serial port. I just loaded that into Audacity and compared it to the PC version.
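For anyone repeating this capture-and-compare test, the piece that makes a raw sample dump loadable in Audacity is the WAV header. Below is a minimal sketch of the canonical 44-byte PCM header; `makeWavHeader` is a hypothetical helper written for illustration and is not taken from ESP8266Audio, whose serial WAV class may work differently.

```cpp
#include <array>
#include <cstdint>

// Builds the canonical 44-byte header of an uncompressed PCM WAV file.
// Hypothetical helper for illustration only (not ESP8266Audio code).
std::array<uint8_t, 44> makeWavHeader(uint32_t sampleRate, uint16_t channels,
                                      uint16_t bitsPerSample, uint32_t dataBytes) {
    std::array<uint8_t, 44> h{};

    // Copies a four-character chunk tag such as "RIFF".
    auto putTag = [&h](int at, const char* tag) {
        for (int i = 0; i < 4; ++i) h[at + i] = static_cast<uint8_t>(tag[i]);
    };
    // WAV stores all integers little-endian.
    auto put16 = [&h](int at, uint16_t v) {
        h[at] = static_cast<uint8_t>(v & 0xff);
        h[at + 1] = static_cast<uint8_t>(v >> 8);
    };
    auto put32 = [&h](int at, uint32_t v) {
        for (int i = 0; i < 4; ++i) h[at + i] = static_cast<uint8_t>((v >> (8 * i)) & 0xff);
    };

    const uint16_t blockAlign = static_cast<uint16_t>(channels * bitsPerSample / 8);

    putTag(0, "RIFF");  put32(4, 36 + dataBytes);   // file size minus 8 bytes
    putTag(8, "WAVE");
    putTag(12, "fmt "); put32(16, 16);              // size of the fmt chunk
    put16(20, 1);                                   // format 1 = integer PCM
    put16(22, channels);
    put32(24, sampleRate);
    put32(28, sampleRate * blockAlign);             // bytes per second
    put16(32, blockAlign);                          // bytes per sample frame
    put16(34, bitsPerSample);                       // 8-bit WAV data is unsigned
    putTag(36, "data"); put32(40, dataBytes);
    return h;
}
```

Writing this header followed by the raw sample bytes gives a file that a PC audio editor can open, which is enough to tell whether the noise is already present in the synthesized data.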

earlephilhower commented 6 years ago

@Martin-Laclaustra - Just got a chance to look at that video. They're doing the same thing I mentioned before, which doesn't work: you can't connect the 1T output to anything other than a bare speaker. If it goes into any kind of amplifier (like for that LED speaker they were using) it will sound like that buzz or worse.

The 1T output is a binary signal at 0 or 5V, with nothing in between. When you connect it directly to an 8ohm physical paper-cone speaker, the speaker cone itself has inertia, so it acts as a low pass filter and averages the density of pulses to give a nice, analog output.
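The idea of "density of pulses" can be illustrated with a first-order delta-sigma style modulator. This is a minimal sketch under my own assumptions, not the code ESP8266Audio uses; `OneBitModulator` and `pulseDensity` are hypothetical names, and the averaging function only stands in for what a slow mechanical load does.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Turns one signed 16-bit sample into `oversample` one-bit pulses whose
// share of 1s tracks the sample level. Illustration only.
struct OneBitModulator {
    int32_t accumulator = 0;  // running remainder carried between pulses

    std::vector<uint8_t> modulate(int16_t sample, int oversample) {
        std::vector<uint8_t> bits;
        bits.reserve(static_cast<std::size_t>(oversample));
        // Shift the signed sample into the unsigned range 0..65535.
        const int32_t level = static_cast<int32_t>(sample) + 32768;
        for (int i = 0; i < oversample; ++i) {
            accumulator += level;
            if (accumulator >= 65536) {
                accumulator -= 65536;
                bits.push_back(1);  // pin high
            } else {
                bits.push_back(0);  // pin low
            }
        }
        return bits;
    }
};

// Fraction of pulses that are high: roughly what a speaker cone with
// inertia "sees" once the fast switching is averaged away.
double pulseDensity(const std::vector<uint8_t>& bits) {
    if (bits.empty()) return 0.0;
    std::size_t ones = 0;
    for (uint8_t b : bits) ones += b;
    return static_cast<double>(ones) / static_cast<double>(bits.size());
}
```

A mid-scale sample (0) yields alternating pulses with a density of one half, while a sample halfway to positive full scale (16384) yields three high pulses out of every four. A load that cannot average the pulses reproduces the switching itself instead of the density.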

When you feed the 1T output to an amp, you are alternately grounding and overdriving the op-amp's input at a high frequency. That causes ringing, and the op-amp has a frequency response high enough to amplify the high-frequency noise, so you get that buzzing.

The same problem may happen with piezo speakers. They have a very high frequency response, normally, and have (almost) no inertia. So you hear the buzzing at high frequency.

You could attach the 1T output to a low pass and feed that into an amplifier. But at that point it is easier to just get an I2S DAC and avoid the whole thing (plus get stereo and true 16-bit output).
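If you do try the low-pass route, a single resistor-capacitor stage is the simplest starting point, and its cutoff follows the standard formula 1/(2πRC). The helper below is a small sketch for sizing the parts; the function name and any component values you plug in are assumptions for illustration, not values recommended in this thread.

```cpp
// Cutoff frequency (-3 dB point) of a first-order RC low-pass filter, in Hz.
// Hypothetical helper for illustration only.
double rcCutoffHz(double resistanceOhms, double capacitanceFarads) {
    const double pi = 3.14159265358979323846;
    return 1.0 / (2.0 * pi * resistanceOhms * capacitanceFarads);
}
```

For example, `rcCutoffHz(1000.0, 10e-9)` (1 kΩ with 10 nF) gives roughly 15.9 kHz. A single stage rolls off gently, so some switching noise can still reach the amplifier.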

Martin-Laclaustra commented 6 years ago

@earlephilhower Thanks again! My setup is a simple speaker (earphones). They may have little inertia, but this is what I found:

It is not the signal that SAM generates: that sounds correct on the PC. It is not a general problem with the circuit or your WAV playing routine either (at least not a general one), because viola.wav sounded OK. There must be some difference between viola and generated-voice that explains either a different mechanical behaviour of the circuit or different processing by your routine.

Then I inspected the files in Audacity, a hex editor, and mediainfo... and... we've got a suspect! Silences are 0x00 in viola and 0x80 in generated-voice, corresponding to a 16-bit signed WAV and an 8-bit unsigned WAV respectively. (Attached: mediainfo of viola and generatedvoice.txt) My guess is that the WAV playing routine (AudioOutputI2SNoDAC) is outputting SAM signals (which are unsigned bytes) as either signed bytes or, worse, signed 16-bit ints. That would explain the sound. I apologize because I could not check further whether you have a setting to tell the routine what kind of signal it is going to be fed (that would mean an extra easy solution to the SAM bug: just adjust the routine settings), nor perform various tests with 8-bit and 16-bit WAVs.
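To make the hypothesis concrete, here is a small sketch of what happens to the silence value 0x80 under the two interpretations. Both function names are hypothetical and this is not ESP8266Audio code; it only shows why reading unsigned bytes as signed would turn silence into a full-scale signal.

```cpp
#include <cstdint>

// Correct widening: 8-bit unsigned PCM (silence = 0x80) to 16-bit signed
// PCM (silence = 0). Subtract the midpoint, then scale by 256.
int16_t u8ToS16(uint8_t sample) {
    return static_cast<int16_t>((static_cast<int>(sample) - 128) * 256);
}

// Suspected mistake: treating the same byte as if it were already signed.
// 0x80 then reads as -128, the most negative value, instead of silence.
int16_t misreadAsS8(uint8_t sample) {
    return static_cast<int16_t>(static_cast<int8_t>(sample) * 256);
}
```

With these definitions, `u8ToS16(0x80)` is 0 (silence) while `misreadAsS8(0x80)` is -32768, so every quiet passage would sit at the negative rail.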

Your work is amazing in both ESP8266Audio and ESP8266SAM! (I have ported the lower-quality ArduinoTTS voice, but I was lacking the output part. Maybe I, or we, can get it going to give programmers an additional option. I have had this project, my SAM port and the other TTS, sleeping for more than half a year! Just after I found your port last September and discovered the chirp problem, I entered a busy period that has lasted until now.)

earlephilhower commented 6 years ago

@Martin-Laclaustra well, you piqued my interest for sure. I appreciate the debugging effort!

I've hardwired (well, wirewrapped) a real I2S DAC and just tried the sample. It has the buzzing like you heard, even w/a real DAC. I am absolutely certain it did NOT have this problem when I last ran it.

Either some change in my lib or the Arduino core I2S (which I did rewrite, actually, a month or so ago) is causing this!

So now that I can reproduce it, and it's the same w/DAC as w/o, that leads me to believe the SAM implementation or the I2S 8->16 bit is hosed somehow.

I've just added a new SSL/TLS to the Arduino core (and a scanner driver for my new Canon scanner w/SANE and a fix to simple-scan w/GNOME) so I've been neglecting this.

Let me take a look at the code and see what I can find. It may take a few days, but it's now definitely on my radar.

I've never heard of ArduinoTTS. Is it a formant based synthesizer or something else? If you can generate samples in bunches, it's pretty simple to add an AudioGeneratorArduinoTTS class. The RTTTL code, for example, was something like 30 mins of effort once someone pointed me to it.

Will update when I find something...

MochiLata commented 6 years ago

Thanks for fixing this problem. I’m not as proficient as you guys, but I can definitely help with the testing.

Martin-Laclaustra commented 6 years ago

Thanks!

> the SAM implementation or the I2S 8->16 bit is hosed somehow.

... or that conversion at the ESP8266Audio AudioOutput parent class (SetBitsPerSample function). Suspected because the 8bit wav file (already independent from SAM) played OK on PC but incorrectly from SPIFFS.

I will have a look at it too if I get the time.

> ArduinoTTS. Is it a formant based synthesizer or something else?

Yes. It relied on AVR timers, and I hacked it to generate raw PCM and run on the PC. I will try to plug it into ESP8266Audio.

earlephilhower commented 6 years ago

Please update ESP8266Audio. Commit 0987b12e585c7f8a28d01cf5a751731101fd9442 fixes this.

Basically we do a subtract and multiply to go from u8->u16. But that's done on a passed in array of samples on each call to ->ConsumeSample. If ConsumeSample fails, the caller needs to try later (FIFO full). But on the next call, it'd perform the multiply and subtract again, leading to noise. Both I2S/I2SNoDAC had the same problem.

4-line fix in each file, and all's well!
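The failure mode described above can be reproduced in a few lines. This is a hedged sketch, not the code from the commit: `FakeOutput`, `widen`, `consumeBuggy`, `consumeFixed` and the retry helpers are hypothetical names, the exact arithmetic (mask, subtract 128, multiply by 256) is my assumption, and converting a local copy is just one way to avoid the double conversion.

```cpp
#include <cstdint>

// Hypothetical stand-in for an audio output with a FIFO that can be full.
struct FakeOutput {
    bool fifoFull = false;
    int16_t lastWritten = 0;

    // Assumed 8-bit to 16-bit step: keep the low byte, subtract the
    // unsigned midpoint, then scale up.
    static int16_t widen(int16_t s) {
        return static_cast<int16_t>(((s & 0xff) - 128) * 256);
    }

    // Buggy: converts the caller's buffer in place before knowing whether
    // the write will succeed, so a retry converts the value a second time.
    bool consumeBuggy(int16_t sample[1]) {
        sample[0] = widen(sample[0]);
        if (fifoFull) return false;
        lastWritten = sample[0];
        return true;
    }

    // One possible fix: convert into a local copy and leave the caller's
    // buffer untouched, so a retry starts from the original value.
    bool consumeFixed(const int16_t sample[1]) {
        const int16_t converted = widen(sample[0]);
        if (fifoFull) return false;
        lastWritten = converted;
        return true;
    }
};

// Simulates a caller whose first attempt hits a full FIFO and who then
// retries with the same buffer.
int16_t retryBuggy(int16_t raw) {
    FakeOutput out;
    int16_t sample[1] = {raw};
    out.fifoFull = true;
    out.consumeBuggy(sample);  // fails, but sample[0] is already converted
    out.fifoFull = false;
    out.consumeBuggy(sample);  // converts the converted value again
    return out.lastWritten;
}

int16_t retryFixed(int16_t raw) {
    FakeOutput out;
    int16_t sample[1] = {raw};
    out.fifoFull = true;
    out.consumeFixed(sample);  // fails, buffer unchanged
    out.fifoFull = false;
    out.consumeFixed(sample);  // converts the original value once
    return out.lastWritten;
}
```

In this model, an 8-bit silence value of 128 comes out as 0 through the fixed path but as -32768 through the buggy path whenever a retry happens. That matches the symptom: noise that depends on FIFO timing, not on the synthesized data.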

Martin-Laclaustra commented 6 years ago

Test passed!

Thanks for your attention and dedication! I think you can close the issue.

earlephilhower commented 6 years ago

Thanks for the verification. Closing.