OpenNBS / OpenNoteBlockStudio

An open-source Minecraft music maker.
https://opennbs.org/
MIT License
728 stars, 50 forks

Audio export sounds different than what you hear in the editor #452

Open Bentroen opened 1 month ago

Bentroen commented 1 month ago

Overview

Many users have pointed out that the NBS audio export sounds different/worse than what you hear in the editor. This is, in fact, true, and this post is intended to address in depth why this is the case.

It isn't completely understood what makes NBS editor audio playback sound the way it does, but there seem to be multiple factors at play, most of which are outside our control because they are handled not by NBS's code but by the underlying audio engine in GameMaker. In general, though, we don't consider the editor playback a good representation of what a song is actually supposed to sound like.

nbswave, the library written to replace the previous DLL as the backend for audio export, is supposed to be the cleanest, purest form of audio you can get. There are no post-processing effects, compression, or equalization applied to the track, so everything sounds balanced overall.

This isn't the case in the editor, which has compression applied: in busier songs, you can hear the song "dip down" heavily during playback. Sometimes even this compressor struggles to hold everything back, as can be seen in the following screenshot of a recorded editor session:

image (courtesy of @BedrRedstone)

Since NBS is made in a game engine, it's understandable that there's compression applied. After all, you don't want players of your game to have their ears burst because the game is playing too many audio sources at once. For an audio editing application, however, it's desirable to have as much control as possible over what goes through to the user's speakers, which unfortunately is not the case with NBS today.

This is an unfortunate reality, since anyone working in any kind of music editing software expects exported files to be consistent with what they hear in the editor: basically, WYHIWYG (what you hear is what you get). We're working to address these differences. The research we did before writing this post is part of that effort, and is also meant to make the issue clearer to the community at large.

Of course, with audio export being a first-class feature in NBS, and the primary way to get your creation out of the program, our end goal is for it to match what you hear in the editor exactly, or at least for there to be a known, reproducible way to apply certain effects to the exported track so it matches the editor as closely as possible. This involves both making the editor playback more consistent and predictable (something we expect to do in NBS v4.0) and improving the audio export to replicate its nuances.

If the exported track's and the recorded playback's audio signals were identical, they would cancel out completely when their waveforms were lined up in an audio editor and one of the tracks was inverted. Below is a list of (hopefully) all the reasons that prevent this from happening.
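The cancellation check described above (often called a null test) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `null_test` and `peak_dbfs` are hypothetical, the tracks are assumed to be already time-aligned lists of float samples in the range -1.0 to 1.0, and a generated sine wave stands in for real recordings.

```python
import math

def null_test(track_a, track_b):
    """Invert track_b, sum it with track_a, and return the residual samples."""
    if len(track_a) != len(track_b):
        raise ValueError("tracks must be the same length and already aligned")
    return [a + (-b) for a, b in zip(track_a, track_b)]

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (1.0); -inf for pure silence."""
    peak = max((abs(s) for s in samples), default=0.0)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

# Hypothetical data: a 440 Hz sine at 44.1 kHz stands in for the exported track.
SAMPLE_RATE = 44100
exported = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
            for n in range(SAMPLE_RATE)]
identical = list(exported)
quieter = [0.9 * s for s in exported]  # same signal with a small gain change

print(peak_dbfs(null_test(exported, identical)))  # silence: -inf
print(peak_dbfs(null_test(exported, quieter)))    # audible residual, roughly -26 dBFS
```

Anything other than silence in the residual means the two signals differ. Even a plain gain change leaves a clearly measurable remainder, so compression, EQ, or timing differences will prevent a clean null as well.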

List of differences between audio export (as of v3.10.0) and what is heard in the editor

Mitigating the problem

After a close analysis of the differences between a track exported with nbswave (with ±3 ms delay at random on each note) and NBS playback recorded with Audacity via WASAPI, we've found the following setup gets the exported track pretty close to the recorded version:

image (courtesy of @BedrRedstone)

Since the major problem seems to be related to the lack of compression and some frequencies being more or less present, a simple compressor and some EQ can get the exported track pretty close to what you hear in the editor.
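To give a rough idea of what "a simple compressor" does to the exported samples, here is a minimal feed-forward compressor sketch. It is not the setup from the screenshot and not what GameMaker's audio engine does internally. The function name `compress` and every parameter value (threshold, ratio, attack, release) are assumptions chosen for illustration, and the EQ stage is left out entirely.

```python
import math

def compress(samples, sample_rate, threshold_db=-12.0, ratio=4.0,
             attack_ms=5.0, release_ms=100.0):
    """Feed-forward peak compressor; all default values are illustrative."""
    # One-pole smoothing coefficients for the envelope follower.
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    envelope = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Track rising levels quickly (attack) and falling levels slowly (release).
        coeff = attack if level > envelope else release
        envelope = coeff * envelope + (1.0 - coeff) * level
        gain_db = 0.0
        if envelope > 0.0:
            over_db = 20.0 * math.log10(envelope) - threshold_db
            if over_db > 0.0:
                # Above the threshold, keep only 1/ratio of the overshoot.
                gain_db = -over_db * (1.0 - 1.0 / ratio)
        out.append(x * 10.0 ** (gain_db / 20.0))
    return out

# Hypothetical test signals: one second each of a loud and a quiet constant level.
SAMPLE_RATE = 44100
loud = compress([1.0] * SAMPLE_RATE, SAMPLE_RATE)
quiet = compress([0.1] * SAMPLE_RATE, SAMPLE_RATE)
print(round(loud[-1], 3))   # loud material is pulled down once the envelope settles
print(round(quiet[-1], 3))  # material below the threshold passes through unchanged
```

The gain reduction kicking in as the envelope rises is the same kind of "dip down" heard in busy songs in the editor. Matching the editor more closely would mean tuning these parameters by ear or against a recording.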

NoteBlockTool, a different application with a bunch of useful features to transform NBS songs, seems to take a completely different approach to exporting the track: it plays the track through an audio backend (either OpenAL or javax) and actually records the samples. We'd therefore expect it to sound closer to what NBS does, since the GameMaker audio backend is a modified OpenAL. We haven't been able to compare it properly with the other approaches yet, but it looks like a promising and simple way to get satisfying results.

Conclusion

Ultimately, we'd have to get knee-deep into the GameMaker/OpenAL/Windows audio APIs to be able to match the NBS playback with sample accuracy. This, of course, isn't realistically achievable, at least not without someone with a deep understanding of the underlying audio mechanisms.

In light of that, I believe the best course of action right now is to, at the very least, reverse-engineer the audio output of each option at a high level to grasp what sounds good about it, then replicate it as closely as possible externally. This should narrow the gap between the versions and, hopefully, get them to sound closer to each other in the future.

time-killer-games commented 1 month ago

I was going to say I could port ENIGMA's/STIGMA's SDL_mixer audio system, but sadly it's GPLv3 and that is incompatible with GameMaker's proprietary runner. ENIGMA developers (and myself as the lone STIGMA developer) aren't enforcing the GPLv3 for anything besides competing products (game engines) so it should be fine under that circumstance.

Though IANAL.

BedrRedstone commented 1 month ago

jeez, I'm surprised how quickly you basically whipped up a multi-paragraph essay in less than an hour