Effects_Buffer: Offset samples by frame size for multi-channel buffering

myQwil commented 9 months ago

If we request more than 1 frame of multi-channel audio, the buffer will use a method that involves buffering 2 samples at a time for each voice, The placement of these samples, however, is incorrect.

Each frame should consist of 1 sample per voice (we actually duplicate the sample for the sake of stereo upmixing, so if max_voices equals 8, the frame will consist of 16 samples in total).

Rather than placing the voice's samples right next to each other, we should instead give the sample(s) for the 2nd frame an offset of max_voices (times 2 for upmixing), in order to account for the rest of the voices that the full frame will consist of. If we don't do this, the two frames will become interlaced.

Reasons why this has gone unnoticed:

Multi-channel audio is a niche feature that hardly anyone uses.
In most cases, only 1 frame per request is needed.
Even with every 2 frames interlaced, it's not immediately obvious that anything is wrong with the audio, particularly 8-bit audio, unless it is either closely inspected or upsampled.

Wohlstand commented 9 months ago

I guess, you could implement this as a static inline function with reference arguments.

myQwil commented 9 months ago

Now that you mention it, I think the macro is going a bit beyond the scope of the original PR, so I'll just propose the most essential changes, as well as one redundant assignment change, which I'll put in a separate commit.

myQwil commented 9 months ago

Also, the "times 2 for stereo upmixing" involves this line:

out += max_voices * 4; //  * 2 for stereo, * 2 again for the two frames

It's not related to the *2 in [i*2+0] We remove those multiply operations because that was supposed to account for the fact that the samples of 1 voice for the next 2 frames were all grouped together, but now we're not grouping them together, so we don't need to multiply i by 2 anymore.

Wohlstand commented 9 months ago

Okay, the last thing is left is adding of the comment that output format has been changed since the version 0.6.4, or? Or treating they this function before 0.6.4 was an invalid, and it outputs the invalid data (i.e. buggy). What do you think? Also, @sezero, what do you think about this?

sezero commented 9 months ago

Okay, the last thing is left is adding of the comment that output format has been changed since the version 0.6.4, or?

Changed output format is not something to take lightly.

As I understand it (I haven't read the issue carefully), the problem only shows itself with multi-channel files? If that is accurate, I would suggest that we keep the original behavior of gme_play intact and add a new gme_play_multichan to the api to address this.

myQwil commented 9 months ago

I made an example so that you can see the issue firsthand. You'll notice that the audio doesn't sound overtly terrible, but take a closer look at it in a program like Audacity and tell me what you see:

demo/basics.c:

/* C example that opens a game music file and records 10 seconds to "out.wav" */

#include "gme/gme.h"

#include "Wave_Writer.h" /* wave_ functions for writing sound file */
#include <stdlib.h>
#include <stdio.h>

void handle_error( const char* str );

int main(int argc, char *argv[])
{
    const char *filename = "test.nsf"; /* Default file to open */
    if ( argc >= 2 )
        filename = argv[1];

    int sample_rate = 44100; /* number of samples per second */
    /* index of track to play (0 = first) */
    int track = argc >= 3 ? atoi(argv[2]) : 0;

    /* Open music file in new emulator */
    gme_type_t file_type = gme_identify_extension( filename );
    Music_Emu* emu = gme_new_emu_multi_channel( file_type, sample_rate );
    handle_error( gme_load_file( emu, filename ) );

    /* Start track */
    handle_error( gme_start_track( emu, track ) );

    /* Begin writing to wave file */
    wave_open( sample_rate, "out.wav" );
    wave_enable_stereo();

    const int voices = 8;
    const int frames = 64;
    const int buf_size = frames * voices * 2;

    /* Record 10 seconds of track */
    while ( gme_tell( emu ) < 10 * 1000L )
    {
        /* Sample buffer */
        short buf [buf_size], *in = buf, *out = buf;

        /* Fill sample buffer */
        handle_error( gme_play( emu, buf_size, buf ) );

        /* Render frames for 2-channel output */
        for ( int f = frames; f--; out += 2 ) {
            int l = 0, r = 0;
            for ( int v = voices; v--; in += 2 ) {
                l += in[0];
                r += in[1];
            }
            out[0] = l / 2;
            out[1] = r / 2;
        }

        /* Write samples to wave file */
        wave_write( buf, frames * 2 );
    }

    /* Cleanup */
    gme_delete( emu );
    wave_close();

    return 0;
}

void handle_error( const char* str )
{
    if ( str )
    {
        printf( "Error: %s\n", str ); getchar();
        exit( EXIT_FAILURE );
    }
}

myQwil commented 9 months ago

When you zoom in at the level of individual samples, you'll notice a zigzag pattern.

And when you apply this PR's changes to Effects_Buffer.cpp, you'll notice that the zigzag pattern goes away.

myQwil commented 9 months ago

Also, @sezero to address your concern about how this will affect non multi-channel output, it will behave exactly the same as before.

For multi-channel output, max_voices is usually 8, but for non multi-channel output, all the voices are combined into 1 voice, so max_voices just equals 1.

with the previous logic, you would have: i 2 + 0 = 0 i 2 + 1 = 1

with the new logic, you have: i = 0 i + max_voices = 1

Bear in mind that the outer for loop is as follows: for(int i=0; i<max_voices; i++)

So with max_voices equaling 1, i will never increment past 0 when dealing with non multi-channel output.

sezero commented 9 months ago

(i) If single channel audio won't be affected with this, and: (ii) if multi-channel has practically never been attempted,

.. then:

I won't object to this patch (I don't have a vote here: not a maintainer -- just expressing my opinion.) The new versus old behavior change for multi-channel must be well documented in the release notes, too.

myQwil commented 9 months ago

Regarding whether we should treat this as a new format, or if we should treat the previous format as invalid, I would say the latter is true. The previous format was invalid (albeit in only very specific cases) since the intended format was 16-channel interleaved audio frames.

We would get correct results when requesting 1 frame at a time, but not when requesting more than 1 frame, so even if we tried making the argument that the old format wasn't wrong, just different, the fact of the matter is that the results were inconsistent depending on how many frames were requested.

sezero commented 9 months ago

OK, up to @Wohlstand now

P.S.: Off-Topic: @Wohlstand: Have you had a chance to review/test the two pull requests at bitbucket site? (Or, is your account at bitbucket dead? If it is, should I re-create them here, if this repo is finally official and distros are looking for libgme here?)

Wohlstand commented 9 months ago

P.S.: Off-Topic: @Wohlstand: Have you had a chance to review/test the two pull requests at bitbucket site? (Or, is your account at bitbucket dead? If it is, should I re-create them here, if this repo is finally official and distros are looking for libgme here?)

Didn't checked yet, I wanted to pull that and verify locally, anyway, I am more prefer to hurry with moving to GitHub and gradually abandon the old repo. Bitbucket is very slow for me, even without everything. My account is still alive, but there is a risk that it would become dead at any moment.

sezero commented 9 months ago

Didn't checked yet, I wanted to pull that and verify locally

OK, will wait

Wohlstand commented 9 months ago

I'll try to verify this today at evening (UTC+3) and I'll merge this with adding documentation note about this.

myQwil commented 9 months ago

If I change the PR at bitbucket to look exactly like this one, could you merge it there instead?

Wohlstand commented 9 months ago

I prefer to do it here, as on Bitbucket I'll lose the access soon (because of unfair reasons).

myQwil commented 9 months ago

I'll change it to look like this one, and if you lose access to your account within that time, this PR will still be here.

Wohlstand commented 4 months ago

Okay, going to merge this now and update the documentation to note about this change.

libgme / game-music-emu

Effects_Buffer: Offset samples by frame size for multi-channel buffering #54