godotengine / godot-proposals

Godot Improvement Proposals (GIPs)
MIT License
1.16k stars 97 forks source link

Deprecate low-quality/limited IMA-ADPCM compression from the WAV importer #4264

Open Calinou opened 2 years ago

Calinou commented 2 years ago

Related to https://github.com/godotengine/godot-proposals/issues/1144.

Describe the project you are working on

The Godot editor :slightly_smiling_face:

Describe the problem or limitation you are having in your project

Godot supports several audio formats:

WAV supports lossy compression via IMA-ADPCM, which is a compression technique originally developed in the 1970s:

image

Unfortunately, this compression is considered very low-quality by today's standards, with Ogg Vorbis or even MP3 performing significantly better in terms of quality. Unlike Ogg Vorbis and MP3, IMA-ADPCM compression also doesn't offer any way to balance between file size and quality with a bitrate control.

In Godot, IMA-ADPCM compression also doesn't support playing audio from a given position.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Deprecate the IMA-ADPCM encoder and its associated import option in the WAV import dock, but keep the IMA-ADPCM decoder for run-time WAV loading once it's implemented. There are still WAV files out there with IMA-ADPCM compression, and loading those should remain supported.

Any CPU released in the last 12 years can decode several Ogg Vorbis or MP3 streams at the same time (even on mobile), so IMA-ADPCM's comparatively faster decoding speed is not relevant anymore.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

Since Godot 4.0 was released after this proposal was opened, this proposal now only suggests deprecating rather than removing IMA-ADPCM to avoid breaking compatibility.

If this enhancement will not be used often, can it be worked around with a few lines of script?

No.

Is there a reason why this should be core and not an add-on in the asset library?

This is about removing an option from a core importer.

This proposal is about avoiding "footguns" in terms of audio quality, by steering users towards higher quality audio compression options when needed.

reduz commented 2 years ago

IMA ADPCM is super cheap to decode and resample, orders of magnitude more than mp3 or ogg. Not a good idea to remove.

You can have hundreds of these decoding at no cost.

USBDongleGoblin commented 2 years ago

I think a better idea is to just improve/fix/replace the encoder & decoder all together. No (or significantly less) compatibility breaking here.

For some reason Godot's encoder (atleast in versions up to and including 3.4.4) in my experience produces results ranging from fine/acceptable, to "this literally sounds like the file was corrupted in export." Some sampling rates tend to sound 8 times worse than they should too (I.E. 32khz sounds much more like 16khz to me).

If you export a file with IMA-ADPCM from Audacity, and compare it to a file encoded via Godot, the file from Audacity is going to sound much better.

I've included a sample project (using a voice line from Half-Life 2 if that's ok) which shows what I'm talking about (although believe it or not, this is one the less egregious instances I've encountered. If I find a sound that causes this, I'll share it). At the start of the IMA-ADPCM encoded sounds there is some very obvious popping.

I've also included some files exported from Audacity to show how the files encoded by Godot should probably sound much more like.

adpcm sucks.zip

adpcm DOESN'T suck.zip

I am also wondering if it might be a good idea to basically just use a different ADPCM format altogether (maybe even a custom one lol).

Calinou commented 2 years ago

reduz recently suggested that we look into adding support for FLAC decoding. This would provide the same quality as uncompressed WAV, but with smaller file sizes. This would also benefit games that can load user-provided music (as FLAC has been increasingly replacing MP3 in people's local music libraries). There's a BSD0-licensed library for it: https://github.com/jprjr/miniflac

I don't know how much CPU FLAC decoding requires though – probably more than IMA-ADPCM, but maybe less than Ogg Vorbis?

Calinou commented 2 years ago

Note that lossy WAV is an alternative that can be explored if Ogg Vorbis or even MP3 decoding are too expensive in your project. It's compatible with any program that can read WAV files, without requiring dedicated support.

caimantilla commented 1 year ago

I think a better idea is to just improve/fix/replace the encoder & decoder all together. No (or significantly less) compatibility breaking here.

For some reason Godot's encoder (atleast in versions up to and including 3.4.4) in my experience produces results ranging from fine/acceptable, to "this literally sounds like the file was corrupted in export." Some sampling rates tend to sound 8 times worse than they should too (I.E. 32khz sounds much more like 16khz to me).

If you export a file with IMA-ADPCM from Audacity, and compare it to a file encoded via Godot, the file from Audacity is going to sound much better.

I've included a sample project (using a voice line from Half-Life 2 if that's ok) which shows what I'm talking about (although believe it or not, this is one the less egregious instances I've encountered. If I find a sound that causes this, I'll share it). At the start of the IMA-ADPCM encoded sounds there is some very obvious popping.

I've also included some files exported from Audacity to show how the files encoded by Godot should probably sound much more like.

adpcm sucks.zip

adpcm DOESN'T suck.zip

I am also wondering if it might be a good idea to basically just use a different ADPCM format altogether (maybe even a custom one lol).

I just thought it might be because the audio is unfiltered, and that's indeed the case. The encoding is fine, it's a bug. I tested by using IMA-ADPCM compression on a 44.1khz song. With my system sound set to 48khz, it sounds awful, but with the system sound set to 44.1khz (so no resampling), it sounds totally fine. Godot uses Cubic resampling for all other formats, doesn't it? I even tried using zero order hold for the same track in Foobar2000, it sounds almost identical (sans the actual compression, so a little better).

Anyways, sorry for rambling, my final comment is that this leads to a slightly cleaner sound when max rate is set to 24khz than off. I think that the audio filter should be an import option if possible, if not then just fix IMA-ADPCM to use cubic.

unfa commented 1 year ago

If FLAC import was added, I'd use that instead of WAV just to save space in the game repository.

fire commented 1 year ago

@unfa I posted https://github.com/godotengine/godot/pull/80160 pull request for FLAC import.

DeeJayLSP commented 1 year ago

I think a better idea is to just improve/fix/replace the encoder & decoder all together. No (or significantly less) compatibility breaking here.

I'd like to mention the existence of ADPCM-XQ, developed by the creator of WavPack.

It uses dynamic noise shaping to reduce quantization noise and supports encode lookahead, where samples can go from 0 to 8 which could be exposed as an import option.

Even at level 0 it manages to get less audible artifacts than Godot or FFmpeg's implementations. At 5 samples (although going above that occasionally takes forever to encode), the result is usually more faithful to the original when compared to those. And there seems to be no size penalty by using it.

The encoding result is readable by any program that supports adpcm_ima_wav.

One thing to note, however, is that the repo's README mentions some bugs. But with a bit of adjustments, I think this can be integrated into Godot.

DeeJayLSP commented 1 year ago

In addition, I should mention the QOA format.

Finished in April 2023, it is a lossy format specifically designed as an alternative to ADPCM:

On the other hand, I don't see any popular software currently adopting the format, likely because it's quite new.

Calinou commented 1 year ago

QOA looks pretty interesting to me, especially if we manage to integrate an encoder in Godot. This way, you can compress your WAVs to QOA on import (and keep your original lossless audio for iterating upon). This means third-party app support doesn't matter here, as only Godot needs to encode and decode the output QOA files.

fire commented 1 year ago

My proposal for FLAC support was rejected so I don't know if QOA will be accepted.

DeeJayLSP commented 1 year ago

Offtopic

Is anyone interested in a custom QOA module (to read QOA files, not convert from WAV)? I kinda tried to write one but couldn't get playback instance and mixing to work.

Available here.

DeeJayLSP commented 1 year ago

Offtopic (part 2)

The QOA module I mentioned above is now somewhat functional if anyone's interested πŸ˜ƒ.

USBDongleGoblin commented 1 year ago

Offtopic (part 2)

The QOA module I mentioned above is now somewhat functional if anyone's interested πŸ˜ƒ.

Following the documentation's guide, I was able to compile the engine on it's own, but not with the module. I might try with Visual Studio instead of SCons or maybe even Linux to see if that works.

An encoder should definitely be included due to QOA's current lack of support and for the reasons mentioned by @Calinou. Could also be a good idea to make the engine re-encode all IMA-ADPCM encoded files as QOA if no one fixes IMA-ADPCM decoding.

darksylinc commented 1 year ago

Every time I have an issue w/ ADPCM I end up here, so I may as well chime in w/ some experience (about a project NOT using Godot):

We use ADPCM on most SFX where it doesn't matter, like engine sounds, explosions, bullet pew pew, etc. For music we use Opus.

We use adpcm_xq for encoding.

We didn't try QOA, but sounds interesting and a better alternative that ticks the same checkboxes.

Any CPU released in the last 12 years can decode several Ogg Vorbis or MP3 streams at the same time (even on mobile), so IMA-ADPCM's comparatively faster decoding speed is not relevant anymore.

This is "technically true, BUT..." on mid & low end mobile, OggVorbis/Opus decoding takes around 3-10% of CPU time per stream. This means:

  1. 16 Channels at the same time on a single core in a low end phone may mean audio starts popping because we can't decode fast enough (more channels is irrelevant because it becomes a cacophony)
  2. A much bigger issue is battery. Even at 3% per stream, 16 channels means 48% of a single CPU allocated to audio decoding. That burns battery and makes the phone feel warm to touch. For JUST SOUND EFFECTS (that most phone users will likely play on mute anyway).
  3. WAV is an obvious alternative for this SFX, but on mobile you're always memory constrained (due to mobile not having swap file), so it's nice to have a 0.25x RAM footprint.

In Godot, https://github.com/godotengine/godot/issues/18878.

ADPCM is a fixed-size block format. Which means arbitrary seeking is really easy to calculate back and forth.

Only special care needs to be taken for the last block because e.g. an audio file with 2000 samples and 1024 samples per block means the last block is gonna have 2000 - 1024 = 976 samples of useful data and 48 samples of garbage.

More importantly, if the audio files has 1500 samples, then the first block will have 1024 samples, but the last one will use 512 samples instead of 1024.

Here's example code where the wanted sample is at mCurrIndex.

/// The returned value is independent of the number of channels.
/// e.g. if the input has 505 samples, it returns 505 whether it's stereo or mono
inline size_t getImaNumSamplesPerBlock( const size_t blockSizeBytes, const size_t numChannels )
{
    return ( blockSizeBytes - 4u * numChannels ) * ( numChannels ^ 3 ) + 1u;
}

/// The inverse of getImaNumSamplesPerBlock.
/// (numSamples - 1) must be multiple of 8
inline size_t getImaBlockSize( const size_t numSamples, const size_t numChannels )
{
    return ( numSamples - 1u ) / ( numChannels ^ 3 ) + ( numChannels * 4 );
}

const size_t samplesPerBlock =
    AudioFormats::getImaNumSamplesPerBlock( mBytesPerBlock, mNumChannels );

const size_t samplesInLastBlock = mNumFrames % samplesPerBlock;

const size_t framesLeft = mNumFrames - mCurrIndex;

const size_t compressedBlockIdx = mCurrIndex / samplesPerBlock;
const size_t uncompressedOffset = mCurrIndex - ( compressedBlockIdx * samplesPerBlock );

size_t blockSizeBytes = mBytesPerBlock;
if( framesLeft <= samplesInLastBlock )
{
    const size_t lastBlockNumAdpcmSamples = ( ( samplesInLastBlock + 6u ) & ~7u ) + 1u;
    blockSizeBytes = AudioFormats::getImaBlockSize( lastBlockNumAdpcmSamples, mNumChannels );
}

adpcm_decode_block( decodedFrames, mData + compressedBlockIdx * mBytesPerBlock, blockSizeBytes,
                    mNumChannels );

const int16_t * inputBuffer16 = decodedFrames + uncompressedOffset * mNumChannels; // Decoded data is here
DeeJayLSP commented 8 months ago

After a long time I'd like to show my support for a QOA encoder.

Since it's a different encoding format from WAV it could be implemented as an alternative importer, similar to how there are different importers for image files (Texture2D, Texture3D, etc.).

However, I don't see how encoders can be implemented without duplicating a lot of WAV import code (after all you need to ensure all WAV formats that can be imported can also be converted). If it's done this way, however, the rest is a simple link to QOA's internal encoder (that is, if I understood the code correctly).

DeeJayLSP commented 8 months ago

QOA looks pretty interesting to me, especially if we manage to integrate an encoder in Godot.

I might have good news.

(There is a limitation however: only 16-bit WAV can be imported this way as it uses the reference implementation's method. We might be able to convert from those types to 16-bit so it goes through.) Limitation fixed.

caimantilla commented 8 months ago

QOA looks pretty interesting to me, especially if we manage to integrate an encoder in Godot.

I might have good news.

(there is a limitation however: only 16-bit WAV can be imported this way)

Is it out of the question to reduce the bit depth before encoding?

DeeJayLSP commented 8 months ago

Is it out of the question to reduce the bit depth before encoding?

Godot itself converts a WAV file from all supported bit depths to 16 bit while importing (except 8-bit), so it's just a matter of spending time to implement it. I was just focused on making the basic converter work.

DeeJayLSP commented 8 months ago

I have updated my custom module to support importing WAV 16-bit into QOA, just in case someone wants to try out.

Wondering if this would make a good addition into Godot itself. Maybe deprecating IMA-ADPCM would make sense then. One caveat is the fact the QOA sources have to be heavily modified for use within the engine.

It adds around 16KB in release template binary size. By my calculations, in case one uses common WAVs (16-bit, 44.1kHz, 2 channels), the size penalty is compensated once at least 0.13s of audio gets converted.

If there's support, I might open a PR with some changes.

DeeJayLSP commented 6 months ago

How would deprecating IMA-ADPCM work? Throw a warning? Put its code within ENABLE_DEPRECATED? Both?

Calinou commented 6 months ago

How would deprecating IMA-ADPCM work? Throw a warning? Put its code within ENABLE_DEPRECATED? Both?

I suggest doing both: