LMMS / lmms

Cross-platform music production software
https://lmms.io
GNU General Public License v2.0

TripleOscillator with hi-pass produces different output depending on the export settings #5749

Open zynskeywolf opened 3 years ago

zynskeywolf commented 3 years ago

Either TripleOscillator's noise generator or the hi-pass filter has a bug that makes the exported file sound different depending on the export settings, including whether the export is done from the graphical interface or the command line. I made a project file in which the problem shows up very clearly, and exported it with a few different settings, using both the GUI and the CLI. The interpolation was "sincmedium" for every file. Here is the project and the exported wav files. Affected versions: current master and most probably 1.2.2 too.

zonkmachine commented 3 years ago

Confirmed. I see this effect on all hi-pass filters with 2x oversampling and the bug is in lmms-1.2.2 too.

he29-net commented 3 years ago

Here is what the spectrum looks like on my machine (top to bottom: _cli, _gui and the original TripleOsc): (screenshot)

Looking at the original, it is clear that TripleOsc generates noise all the way up to 22050 Hz (since my sound card / system apparently runs at 44100 Hz sample rate and that's what LMMS uses).

When the project is exported at 48 or 96 kHz through the GUI, it is exported correctly -- it sounds the same to me, and plotting the spectrum in Audacity shows the noise goes all the way up to the Nyquist limit. When the saved 48 kHz wav file is imported, LMMS apparently downsamples it to the current "active" sample rate (44.1 kHz) and as part of this process applies a ~20 kHz low-pass filter. That's why the second spectrum fades out after 20 kHz. But there should be almost no audible difference.

The low pass (fading) before (or during?) downsampling on import is intentional and necessary. If LMMS did not do it, the result would be what happens with the "_cli" versions -- the frequencies from 22.05 to 24 kHz (or 22.05 to 48 kHz) would be "reflected" down to the resulting spectrum due to aliasing.
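The "reflection" described above can be worked out numerically. This is a minimal sketch, assuming an ideal sampler with no anti-aliasing filter in front of it; the function name `alias_frequency` is made up for illustration and is not part of LMMS.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Return the frequency at which a pure tone appears after being
    sampled at sample_rate_hz with no low-pass filter applied first."""
    nyquist = sample_rate_hz / 2
    # Sampling cannot tell apart frequencies that differ by a multiple of the rate.
    folded = freq_hz % sample_rate_hz
    # Anything in the upper half of that range mirrors back below Nyquist.
    if folded > nyquist:
        folded = sample_rate_hz - folded
    return folded


# Content that exists in a 48 kHz or 96 kHz file but not at 44.1 kHz:
for tone in (23000, 30000, 40000):
    print(tone, "Hz lands at", alias_frequency(tone, 44100), "Hz")
```

With these inputs, a 23 kHz tone comes back at 21.1 kHz and a 40 kHz tone at 4.1 kHz, which is well inside the audible range.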

I would guess that since LMMS does not need to open any audio device during CLI export, it will use some default sample rate as the "project sample rate", and something somewhere breaks and causes aliasing because of that. That's the only difference between CLI and GUI I can think of as a possible cause. I.e. maybe TripleOsc produces samples at 48 kHz, but before reaching the output they go back to 44.1 kHz without the necessary filter? And then the resulting garbage would be correctly upsampled to 48 kHz and saved? Something along these lines. I don't have much of a clue about these deep internals, so I have no idea about anything specific.


The underlying problem, in my opinion, is that currently it is not possible to specify a sample rate for the project. At all. Whatever the system happens to be using is forced upon the user. That has been bothering me pretty much from the first time I opened LMMS: this approach is good for a simple media player (no configuration that the user could break), but not for audio processing software that should produce exact, repeatable and reliable outputs with technical parameters defined by the user.

I ended up ignoring it, since all my HW happens to "agree" on 44.1 kHz, but during the time I spent lurking around this tracker and the LMMS Discord I saw quite a few complaints of the type "Why does my export sound different from what I hear in LMMS?". Most often in connection with samples, IIRC. So this may be just one of multiple issues caused by a single underlying "design flaw".

Ideally, LMMS should do all its internal processing in a fixed sample rate selected by the user and then resample if needed as the last step, before passing samples to the output (be it speaker or a file writing library). This way the output would be always consistent, no matter what machine LMMS is running on, no matter the export sample rate. But whether such a change would be possible without a major rewrite I have no idea.

PhysSong commented 3 years ago

TripleOscillator's noise oscillator is not band-limited, which makes the loudness of a given spectral region depend on the sampling rate.
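To put rough numbers on that dependence, here is a back-of-the-envelope sketch. It assumes idealised white noise with one independent random value per output sample and the same per-sample amplitude at every rate, so the total power is spread evenly from 0 Hz up to the Nyquist frequency. Whether TripleOscillator's generator behaves exactly like this is an assumption, and both function names are hypothetical.

```python
import math


def white_noise_band_fraction(band_lo_hz, band_hi_hz, sample_rate_hz):
    """Fraction of total white-noise power that falls inside a band,
    assuming a flat spectrum from 0 Hz to Nyquist."""
    nyquist = sample_rate_hz / 2
    lo = min(max(band_lo_hz, 0.0), nyquist)
    hi = min(max(band_hi_hz, 0.0), nyquist)
    return max(hi - lo, 0.0) / nyquist


def band_level_change_db(band_lo_hz, band_hi_hz, rate_a_hz, rate_b_hz):
    """Level change (dB) of one band when going from rate A to rate B."""
    a = white_noise_band_fraction(band_lo_hz, band_hi_hz, rate_a_hz)
    b = white_noise_band_fraction(band_lo_hz, band_hi_hz, rate_b_hz)
    return 10 * math.log10(b / a)


# The 1-10 kHz region of the same noise, rendered at 44.1 kHz vs 96 kHz.
print(round(band_level_change_db(1000, 10000, 44100, 96000), 2), "dB")
```

Under these assumptions the same band is roughly 3.4 dB quieter at 96 kHz than at 44.1 kHz, because the fixed total power is spread over a wider spectrum.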

zynskeywolf commented 3 years ago

The samplerate conversion would explain a lot of the problems, but as far as I know no such thing happens during export or live playback unless oversampling is set. There is no "project sample rate", but the instruments themselves work at the samplerate specified for the current operation. The default 44.1kHz is only used during playback and is changed to the specified value when the export starts.

Using jack, I changed the playback samplerate to 96kHz. The sound is correct when played live and the exported files are also the same as with 44.1. So I think there is no connection between playback and export in this sense.

I put a Glame low-pass filter on the track and turned the cutoff frequency all the way up. The difference is inaudible but it makes the exported file correct, so I agree that the main problem is 3osc producing a non-band-limited signal. I also tested with a saw wave: without the low-pass it's aliasing like hell, but it sounds perfectly fine with the low-pass on it. It could also be the fault of the Sinc algorithm used, at least partially.

he29-net commented 3 years ago

By "project sample rate" I meant Engine::mixer()->processingSampleRate(). AFAIK, it is normally set to the system sample rate, and during export changes to whatever was selected as the output sample rate.

It is true that TripleOsc ideally should not produce any content over ~20 kHz, and there is already #4397 which fixes the issue for all waveform types except noise. My point was that even if we make sure all LMMS native instruments are band-limited, there are still external plugins or other possible sources of signal that goes above 20 kHz.

That's why I think the real problem could be in the way sample rates are managed and converted -- if done properly, resampling to a different rate should not cause this sort of aliasing, even if it starts with signals that have high-frequency content.
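A small experiment illustrates the difference between the two approaches. This is only a sketch under simplifying assumptions: it halves the rate from 96 kHz to 48 kHz (an integer factor, unlike the 48 to 44.1 kHz case discussed here), uses a plain Blackman-windowed sinc filter rather than whatever LMMS or its resampling library does, and all names are made up for illustration.

```python
import math


def windowed_sinc_lowpass(cutoff_hz, sample_rate_hz, num_taps=101):
    """Blackman-windowed sinc low-pass FIR taps, normalised to unity DC gain."""
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = (0.42 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
                  + 0.08 * math.cos(4 * math.pi * n / (num_taps - 1)))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]


def fir_filter(signal, taps):
    """Convolve, keeping only outputs where the filter fully overlaps the input."""
    n = len(taps)
    return [sum(taps[k] * signal[i + n - 1 - k] for k in range(n))
            for i in range(len(signal) - n + 1)]


def rms(signal):
    return math.sqrt(sum(s * s for s in signal) / len(signal))


SOURCE_RATE = 96000
# A 30 kHz tone is valid at 96 kHz but lies above the 24 kHz Nyquist of 48 kHz.
tone = [math.sin(2 * math.pi * 30000 * i / SOURCE_RATE) for i in range(4096)]

# Naive conversion: drop every second sample, no filtering.
naive_rms = rms(tone[::2])

# Proper conversion: low-pass below the new Nyquist first, then drop samples.
taps = windowed_sinc_lowpass(20000, SOURCE_RATE)
filtered_rms = rms(fir_filter(tone, taps)[::2])

print("naive:", naive_rms, "filtered:", filtered_rms)
```

In the naive case the tone survives at full level, folded down to 18 kHz; with the filter in place it is almost entirely removed before it can alias.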