Speakers "squeal" when given values not equal to `0`

b0mbie commented 4 months ago

Minecraft Version

1.20.1

Version

1.111.0

Details

When given samples above or below 0, the speaker starts playing a really obnoxious, extremely high-pitched noise.

I initially spotted this when trying to synthesize sawtooth waves, but I reduced the code to just this:

-- Volume warning: This will destroy your ears (if you can hear it!).
local speaker = assert(peripheral.wrap("top"))

local buffer = {}
while true do
    for i = 1, 4096 do
        buffer[i] = -127
    end

    while not speaker.playAudio(buffer, 1) do
        os.pullEvent("speaker_audio_empty")
    end
end

What I discovered:

Values closer to 0 seem to reduce this noise's volume.
At 0, it does not appear to be playing at all.
At the extremes -128 or 127, the speaker also seems to be completely silent.

Video demonstration of this code (volume warning): https://github.com/user-attachments/assets/9803764d-65d7-4848-99f0-67138e17eca6

Plot of the squeal, put into Audacity for inspection: (image)

Ditto, but with the script changed to output 126 constantly: (image)

The reason why I even spotted this is because the code below, which produces a sawtooth wave, seemed to have very prominent resonance, even though that was never programmed in:

local speaker = assert(peripheral.wrap("top"))

local RATE = 48000
local INV_RATE = 1 / RATE
local BUFFER_SIZE = 4096

local buffer = {}

local FREQ_HZ = 50
local value = 0

while true do
    -- Fill the buffer with a sawtooth wave in the range [-128; 127].
    for i = 1, BUFFER_SIZE do
        value = value + INV_RATE * FREQ_HZ
        if value > 1 then
            value = value - 1
        end
        buffer[i] = math.floor(value * 255 - 128)
    end

    while not speaker.playAudio(buffer, 1) do
        os.pullEvent("speaker_audio_empty")
    end
end

Plot of audio produced in-game: (image)

Below is a plot of a sawtooth signal produced at the same frequency (50 Hz), which was:

exported to a WAVE file encoded in unsigned 8-bit PCM at 48 kHz, then
re-sampled to a WAVE file encoded in signed 24-bit PCM at 48 kHz.
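For anyone wanting to rebuild a comparable reference signal, here is a minimal sketch of how the same sawtooth could be turned into unsigned 8-bit sample values. The helper name `sawtooth_unsigned` is hypothetical, and the `+ 128` offset is the usual signed-to-unsigned mapping for 8-bit PCM, not necessarily how the reference file in this report was made.

```lua
-- Sketch only: regenerates the sawtooth from the script above and maps each
-- signed sample in [-128, 127] to an unsigned 8-bit value in [0, 255].
-- ASSUMPTION: sawtooth_unsigned is a hypothetical helper, not part of CC:T.
local RATE = 48000
local FREQ_HZ = 50

local function sawtooth_unsigned(count)
    local bytes = {}
    local phase = 0
    for i = 1, count do
        -- Advance the phase by one sample and wrap it back into (0, 1].
        phase = phase + FREQ_HZ / RATE
        if phase > 1 then
            phase = phase - 1
        end
        local signed = math.floor(phase * 255 - 128) -- same formula as the report
        bytes[i] = signed + 128                      -- shift into [0, 255]
    end
    return bytes
end

-- One period at 50 Hz and 48 kHz is 960 samples.
local period = sawtooth_unsigned(960)
print(#period, period[1], period[480])
```

The resulting values can then be written out as the data section of an unsigned 8-bit WAVE file with whatever tool is at hand.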

The expected behavior is for speakers to:

not produce noise when given constant samples, and
not produce overtones when the original audio encoded with 8 bits does not also produce them during playback.

SquidDev commented 3 months ago

CC:T re-encodes audio to DFPWM when sending it from the client to the server. DFPWM uses a single bit per sample, and due to its design, isn't very good at holding a constant value at higher amplitudes.

We can use the cc.audio.dfpwm module to see what this gets resampled as.

local dfpwm = require "cc.audio.dfpwm"
local audio = {}
for i = 1, 1024 do audio[i] = -127 end

local resampled = dfpwm.decode(dfpwm.encode(audio))
print(table.concat(resampled, " ", #resampled - 8))
-- -122 -125 -81 -45 -79 -102 -115 -122 -125

This should correspond to your plots in Audacity. As mentioned, DFPWM isn't very good at sustaining high amplitudes. That said, I'm a little surprised it fluctuates as much as it does; I'd need to plot some of the encoder internals to see what's going on.

b0mbie commented 3 months ago

Oh, I see! Thank you for the explanation.

I initially thought that speakers supported "true" PCM audio because of this quote on the tweaked.cc page "Playing audio with speakers":

CC: Tweaked's speakers also work with PCM audio. It plays back 48,000 samples a second, where each sample is an integer between -128 and 127. This is more commonly referred to as 48kHz and an 8-bit resolution.

This is partially true, since speaker.playAudio does accept PCM samples, but is misleading because the speaker does not play those same PCM samples. I think this would be something important to point out.

SquidDev commented 3 months ago

Plotting the various bits of encoder/decoder state, we can see what's going on here:

A plot of the encoder state (input, output, strength and charge). The charge swings wildly between -127 and 127, while the strength increases rapidly to its maximum.

Inside the DFPWM encoder/decoder, "charge" effectively models the output of the decoder (before the low-pass filter and antijerk are applied), while "strength" is used to control the rate at which the charge changes.

When we see a sample that differs from the current charge, the strength increases until it reaches the sample, at which point it decreases. Well, that's what's meant to happen. However, what actually happens is that we hit the desired value exactly, and so the strength increases more and more, causing the charge to flail more and more around the target value.
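To make the charge/strength interaction easier to poke at outside the game, here is a simplified model of a DFPWM-style encoder loop in plain Lua. It is a sketch written from the description in this thread: the function `simulate`, the helper `range`, and the constants `SCALE`, `MAX_STRENGTH` and `MIN_STRENGTH` are assumptions for illustration and may not match CC:T's actual implementation. It also leaves out the low-pass filter and antijerk stages.

```lua
-- Simplified DFPWM-style encoder model, for illustration only.
-- ASSUMPTION: constants, rounding and update rules are a sketch and are not
-- copied from CC:T; the low-pass filter and antijerk are not modelled.
local SCALE = 1024          -- fixed-point scale for the strength
local MAX_STRENGTH = SCALE - 1
local MIN_STRENGTH = 8

-- Feeds `count` copies of a constant `sample` through the model and returns
-- the charge and strength after every step.
local function simulate(sample, count)
    local charge, strength, previous_bit = 0, 0, false
    local charges, strengths = {}, {}
    for i = 1, count do
        -- Emit a 1 bit when the charge has to rise, a 0 bit otherwise.
        local bit = sample > charge or (sample == charge and sample == 127)
        local target = bit and 127 or -128

        -- Move the charge towards the target by strength / SCALE of the gap.
        local next_charge = charge
            + math.floor((strength * (target - charge) + SCALE / 2) / SCALE)
        if next_charge == charge and next_charge ~= target then
            next_charge = next_charge + (bit and 1 or -1)
        end

        -- Repeated bits raise the strength, alternating bits lower it.
        if bit == previous_bit then
            if strength < MAX_STRENGTH then strength = strength + 1 end
        elseif strength > 0 then
            strength = strength - 1
        end
        if strength < MIN_STRENGTH then strength = MIN_STRENGTH end

        charge, previous_bit = next_charge, bit
        charges[i], strengths[i] = charge, strength
    end
    return charges, strengths
end

-- Smallest and largest value in values[first..last].
local function range(values, first, last)
    local lo, hi = math.huge, -math.huge
    for i = first, last do
        lo = math.min(lo, values[i])
        hi = math.max(hi, values[i])
    end
    return lo, hi
end

for _, sample in ipairs({ -128, -127, 0, 126, 127 }) do
    local charges = simulate(sample, 4096)
    local lo, hi = range(charges, #charges - 255, #charges)
    print(("sample %4d: charge stays within [%d, %d]"):format(sample, lo, hi))
end
```

In this sketch the charge can only come to rest on -128 or 127, because those are the only targets a single bit can ask for. Any other constant input keeps the charge moving, which is at least consistent with the observation that the extremes are silent while -127 and 126 are not.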

I'm not actually sure there's a good fix in the encoder/decoder here. I've documented this for now (7285c32d58aadf5037597471c28cacf1733cdae0), and will have a bit more of a think.
