earlephilhower / ESP8266Audio

Arduino library to play MOD, WAV, FLAC, MIDI, RTTTL, MP3, and AAC files on I2S DACs or with a software emulated delta-sigma DAC on the ESP8266 and ESP32
GNU General Public License v3.0
2.01k stars 432 forks source link

AudiogeneratorMOD: use full 16bit with slightly better accuracy #479

Closed softhack007 closed 2 years ago

softhack007 commented 2 years ago

Hi, This is a follow-on to my previous PR. I did a few tweaks that improve the "real" resolution (from 10bit to 12-14bit). I have also enabled 8-channel playback, however this means that many internal structures are now 2x the size - please check if this is OK from memory usage perspective.

I only have internalDAC so I could not check if its an audible improvement. However I verified that the wavform (WAV file output) looks similar (see attached picture). If you look closely at the frequency (spectral) plot, you may notice that there is less "blue noise" in the higher frequencies, as consequence of the more accurate interpolation of instrument samples.

compare_waveform_and_spectrum

softhack007 commented 2 years ago

PS: sorry about the excessive commit list - I merged my own "work in progress" branch into a fresh branch, then deleted some debugging stuff, etc - as result git listed all these previous commits. Actually the diffs shown as "files changes" are the ones I wanted in this PR.

earlephilhower commented 2 years ago

PS: sorry about the excessive commit list

Thanks, and no worries, the git diff is all that matters because we do squash-and-merge here (so all commits in a PR become a single one in master).

I'm a little worried about the scaling difference, though. Your "after" waveform is obviously 2x or 4x higher volume than the "before." Are you sure that something in the scaling isn't broken here? I see your comments about pre-shifting to optimize things, but it still seems a little worrysome.

I'll look at where CHANNELS is used to see if there's any real memory issue on the ESP8266. Another option would to be #ifndef CHANNELS / #define CHANNELS 4 / #endif potentially...

softhack007 commented 2 years ago

I'm a little worried about the scaling difference, though. Your "after" waveform is obviously 2 or 4 times higher

Actually that's expected, because the previous code was producing relatively low volume. By going from BITDEPTH=15 to BITDEPEPTH=16, each sample is effectively scaled up by 2 - to the +plus or -minus direction respectively. I've checked some internal variables in the mixer code, and they are within expected ranges.

softhack007 commented 2 years ago

Hi, I did some more testing. Actually the scaling is working as expected, and I can reproduce the previous waveform volumes by changing BITDEPTH=16 back to BITDEPTH=15. So this it OK, and BITDEPTH=16 does what it should do.

Now comes the "but". Maybe your feeling about my "pre-shifting optimization" was right. Re-thinking the math, the new optimisation can loose some accuracy. In integer math, "(3+1) / 2" is 2. However "3/2 + 1/2" results in 1 (truncation) or 3 (rounding).

Maybe its better if I remove the pre-shifting optimisation part, and we just keep the extra 2 bit gained in sample interpolation. What do you think?

earlephilhower commented 2 years ago

Now comes the "but". Maybe your feeling about my "pre-shifting optimization" was right. Re-thinking the math, the new optimisation can loose some accuracy. In integer math, "(3+1) / 2" is 2. However "3/2 + 1/2" results in 1 (truncation) or 3 (rounding). Maybe its better if I remove the pre-shifting optimisation part, and we just keep the extra 2 bit gained in sample interpolation. What do you think?

Good observation. As you noted, the preshifting ends up adding noise and reducing the effective bit depth. So I think your idea to revert that bit makes good sense. Plus, given all the other operations to generate each sample, optimizing a shift or two out of it is not really going to impact things, I think.

softhack007 commented 2 years ago

agreed 👍

softhack007 commented 2 years ago

What's your opinion on increasing CHANNELS from 4 to 8?

Many of the "MODs" in the wild need 4 channels. We can also play 8 channel mods, and even 12 channels worked for me (ESP32 @ 240Mhz, 480Kb RAM). However smaller devices may struggle with CPU power or RAM needed. In addition to 2 times (private) data for the class, each channel needs its on "FatBuffer".

earlephilhower commented 2 years ago

CPU will be fine on the 8266 (160MHz is normally what it runs at for this library), but DRAM will be an issue as I look at it especially since the major structures include large static NxN arrays (heap fragmentation could kill things even w/lots of free mem)...

Probably best is to bracket the setting:

#ifdef ESP8266 // Not sure if C3/C2 have RAM constraints, maybe add them here?
#define CHANNELS 4
#else
#define CHANNELS 8
#endif
softhack007 commented 2 years ago

Yes, makes sense. Just did that.