mackron / dr_libs

Audio decoding libraries for C/C++, each in a single source file.

question: proper use of library #182

Closed VincentGijsen closed 3 years ago

VincentGijsen commented 3 years ago

Hi,

I wanted to ask if someone can quickly tell me whether I'm using dr_flac.h correctly, as I'm getting strange results which I cannot really explain.

Basically, I use the library in a microcontroller environment (STM32F407) with ST's libraries. After a lot of fiddling with memory constraints, I am able to decode FLAC audio from the SD card and stream it over USB to the host PC. However, it sounds as if the playback rate is 2x (at least), and the audio seems to come in 'squeezed' in amplitude.

Hopefully someone can give me some pointers on where to look. One thing to know about the USB audio side: I use Full Speed, and synchronize the sending of data to exactly 1 ms via the isochronous pipe.

My init function roughly does the stuff below, and the callbacks are implemented accordingly. It returns successfully, indicating all is fine.

...
FLAC_HANDLER_STATUS drFlac_play(FIL *file) {
    if (!drFlac_allocationConfigured) {
        _setupDr_flac();
        drFlac_allocationConfigured = 1;
    }
    drFlac_stop();

    //reset decoder state
    myData.currentFile = file;
    myData.aditional_frames_on_interval = 0;
    myData.framecnt = 0;
    myData.interval_for_extra_frames = 0;
    myData.bytes_per_sample = 0;

//TODO: implement stop/clear check before start

//set up a 'fresh' decoder
//drflac_open_memory_with_metadata(pData, dataSize, drFlacMetaCallBack, &myData, &allocationCallbacks);

    myData.pFlac = drflac_open_with_metadata_relaxed(_drflac_onRead,
            _drflac_onSeek, _drflac_onMeta, drflac_container_native, &myData,
            &allocationCallbacks);

    if (myData.pFlac) {
        return FLAC_DECODER_READY;
    }
    return FLAC_DECODER_ERROR;
}
...

Then I fill up a buffer that has n slots of 1 ms of data each:

...
FLAC_PCM_STATUS drFlac_updatePCMBatch() {

    switch (songStreamProperties.freq) {
    case FLAC_HZ_44100:
        myData.framesToRead = FLAC_SAMPLES_MS_44100_16BIT;
        myData.aditional_frames_on_interval =
        FLAC_SAMPLES_MS_44100_16BIT_EXTRA_EACH_FRAMES;
        myData.interval_for_extra_frames =
        FLAC_SAMPLES_MS_44100_16BIT_EXTRA_EACH_FRAMES_TO_INSERT;
        myData.bytes_per_sample = FLAC_SAMPLES_MS_44100_16BIT_BYTES_PER_SAMPLE;
        break;

    case FLAC_HZ_48000:
        myData.framesToRead = FLAC_SAMPLES_MS_48000_16BIT;
        myData.aditional_frames_on_interval =
        FLAC_SAMPLES_MS_48000_16BIT_EXTRA_EACH_FRAMES;
        myData.interval_for_extra_frames =
        FLAC_SAMPLES_MS_48000_16BIT_EXTRA_EACH_FRAMES_TO_INSERT;
        myData.bytes_per_sample = FLAC_SAMPLES_MS_48000_16BIT_BYTES_PER_SAMPLE;
        break;

    default:
        //error:
        return FLAC_PCM_STATUS_ERROR;
        ;
    }
    uint8_t frames_to_get = myData.framesToRead;

    //additional frame logic
    if (myData.interval_for_extra_frames > 0) {
        myData.framecnt++;

        if (myData.framecnt == myData.interval_for_extra_frames
                && (myData.framecnt > 0)) {
            myData.framecnt = 1; //careful, 0 is reserved for disabled
            frames_to_get += myData.aditional_frames_on_interval;
        }
    }
    uint16_t samplesProvided = drflac_read_pcm_frames_s16(myData.pFlac,
            frames_to_get, flacPCMBuffer);

    //check result of decoding run
    if (samplesProvided == 0) {
        return FLAC_PCM_STATUS_ERROR;
    } else if (samplesProvided < frames_to_get) {

        //set rest of buffer to zero (memset takes an address and a size in bytes)
        memset(&flacPCMBuffer[samplesProvided], 0,
                (frames_to_get - samplesProvided) * sizeof(flacPCMBuffer[0]));
        //TODO: send FLAC_PCM_STATUS_PARTIAL
    } else {
        //all samples filled

    }

    putBuffer(&flacPCMBuffer, frames_to_get, myData.bytes_per_sample);
    return FLAC_PCM_STATUS_FILLED;
    //}

//other bit stuff not implemented
    //return FLAC_PCM_STATUS_ERROR;

}
...

void _guard_write_and_increment() {
    if ((_buffer.pWrite + 1) == _buffer.slots) {
        _buffer.pWrite = 0;
    } else
        _buffer.pWrite++;

}
...
BUFFER_RESULT putBuffer(uint16_t *pcmSamples, uint8_t len,
        uint8_t bytesPerSample) {
    if (!_buffer.Isinitialized)
        return BUFFER_ERROR;

    //the first time, we start at frame 1
    _guard_write_and_increment();

#define MSB (pcmSamples[x] & 0xff)
#define LSB ((pcmSamples[x] >> 8)&0xff)

    for (uint8_t x = 0; x < len; x++) {
        _buffer.data[_buffer.pWrite].frame[(x * 2)] = MSB;
        _buffer.data[_buffer.pWrite].frame[(x * 2) + 1] = LSB;

    }
    _buffer.data[_buffer.pWrite].len = (len * bytesPerSample); //we doubled # of frames as 16bit ->2x 8bit

    return BUFFER_OK;
}
...

This seems to fill up the buffers each time a 1 ms slot is available.

So what I want to ask specifically is whether the reasoning below is OK:

The USB bus uses uint8_t bytes, so each PCM sample (16-bit) is 2 bytes.

So each run, I ask for (FREQ * CHANNELS * 2 [bytes]) / 1000 PCM samples. For a 44.1 kHz stream, this implies 44100 * 2 * 2 / 1000 = 88 samples (rounded), and every 10 frames I add an additional two to re-sync, keeping in mind that these 88 frames are sent every 1 ms. For 48 kHz there is no rounding issue, and we transmit 96 samples every 1 ms.
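One way to double-check this arithmetic is to keep frames, samples and bytes as three separate quantities. The following is a minimal sketch, assuming 16-bit PCM; the type `packet_size_t` and the function `packet_size_per_ms` are hypothetical names and not part of dr_flac. It may be relevant here because `drflac_read_pcm_frames_s16()` counts in PCM frames (one sample per channel), so if 88 were passed as the frame count for a stereo 44.1 kHz stream, that would request roughly 2 ms of audio rather than 1 ms.

```c
#include <stdint.h>

/* Hypothetical helper: sizes of one 1 ms packet of 16-bit PCM. */
typedef struct {
    uint32_t frames;   /* whole PCM frames per ms (one sample per channel) */
    uint32_t samples;  /* interleaved int16 values per ms = frames * channels */
    uint32_t bytes;    /* bytes per ms = samples * 2 */
} packet_size_t;

static packet_size_t packet_size_per_ms(uint32_t sample_rate_hz, uint32_t channels)
{
    packet_size_t p;
    p.frames  = sample_rate_hz / 1000u;  /* integer division drops the fraction */
    p.samples = p.frames * channels;
    p.bytes   = p.samples * (uint32_t)sizeof(int16_t);
    return p;
}
```

For 48 kHz stereo this gives 48 frames, 96 samples and 192 bytes per millisecond. For 44.1 kHz stereo it gives 44 frames, 88 samples and 176 bytes, with 0.1 frame per millisecond left over to be caught up later.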

When transmitting these uint16_t values via USB, I convert them to uint8_t and basically double the packet length (as seen in putBuffer(...)).

It seems the stereo separation works, so I guess the PCM packets decoded by dr_flac.h are uint16_t right/left interleaved. It's just too fast, and when played at half speed it sounds crippled, yet I can recognize the song.

The only thing I could think of is that the host polls the device not every 1 ms but more frequently, but so far I haven't seen any indication that this happens. Perhaps someone has some pointers for me?

On a side note: would it be labour-intensive (for me) to convert the malloc stuff to static buffers? In embedded systems this is far more desirable, as it makes it easier to prevent stack corruption.

Below is a capture where I would expect more 'amplitude', but I guess it's not very informative. [Image: Capture]

mackron commented 3 years ago

Sorry for not replying sooner. So dr_flac will return all sample data interleaved. So in a stereo sound, the first int16 in the output buffer will be the left channel of the first frame and the second int16 will be the right channel of the first frame. Then the next two int16s will be for the second frame, etc. Also, you mentioned uint16_t - dr_flac actually returns int16_t samples.
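To make the interleaved layout concrete, here is a small sketch that splits a stereo buffer of the form L0 R0 L1 R1 ... into separate left and right buffers. The function name `deinterleave_stereo_s16` is made up for illustration and is not part of dr_flac. Note that the count is in frames, so the input holds `2 * frame_count` int16 values.

```c
#include <stddef.h>
#include <stdint.h>

/* Split an interleaved stereo buffer (L0 R0 L1 R1 ...) into two planar buffers.
   frame_count is the number of frames, i.e. left/right pairs. */
static void deinterleave_stereo_s16(const int16_t *interleaved, size_t frame_count,
                                    int16_t *left, int16_t *right)
{
    for (size_t i = 0; i < frame_count; i++) {
        left[i]  = interleaved[2 * i];      /* even index: left channel */
        right[i] = interleaved[2 * i + 1];  /* odd index: right channel */
    }
}
```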

Something I noticed in your code is this line:

myData.bytes_per_sample = FLAC_SAMPLES_MS_44100_16BIT_BYTES_PER_SAMPLE;

That looks strange to me, because the number of bytes per sample should not depend on the sample rate. This should always be set to 2 in all cases if you're using the drflac_read_pcm_frames_s16().

In putBuffer() it looks like you're converting from 16-bit to 8-bit? If that's the case, I don't think it's correct. You need to use arithmetic to convert between the two rather than a mask. This is what I use in miniaudio:

for (i = 0; i < count; i += 1) {
    ma_int16 x = src_s16[i];
    x = (ma_int16)(x >> 8);
    x = (ma_int16)(x + 128);
    dst_u8[i] = (ma_uint8)x;
}

So to summarise, you want to check that you're handling the interleaving properly and that you fix your 16-bit to 8-bit conversion. Then we can proceed from there.

For your malloc vs static thing, you can implement custom allocation callbacks. Look for drflac_allocation_callbacks in dr_flac.h and you should be able to figure out how it works. You could implement a custom allocation function which draws its memory from a custom statically allocated buffer. But otherwise, no, there's no way to change that.
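As a rough illustration of that idea, the sketch below implements a simple bump allocator over a static buffer, with function signatures matching the onMalloc/onRealloc/onFree members of drflac_allocation_callbacks. Everything prefixed `arena_` and the 4096-byte size are assumptions for the example; a real decoder may need considerably more memory, and this scheme never reuses freed blocks until `arena_reset()` is called.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_BYTES 4096u  /* assumption: size this for your own decoder */

/* Header stored in front of each block so realloc knows the old size. */
typedef union { size_t size; uint64_t force_align; } arena_header_t;

static uint64_t arena_storage[ARENA_BYTES / sizeof(uint64_t)];
static size_t arena_used = 0;

static void *arena_malloc(size_t sz, void *pUserData)
{
    (void)pUserData;
    size_t avail = ARENA_BYTES - arena_used;
    if (avail < sizeof(arena_header_t)) return NULL;
    avail -= sizeof(arena_header_t);
    if (sz > avail) return NULL;  /* also rules out overflow when rounding */
    size_t payload = (sz + 7u) & ~(size_t)7u;  /* keep blocks 8-byte aligned */
    arena_header_t *h = (arena_header_t *)((uint8_t *)arena_storage + arena_used);
    h->size = sz;
    arena_used += sizeof(arena_header_t) + payload;
    return h + 1;
}

static void *arena_realloc(void *p, size_t sz, void *pUserData)
{
    if (p == NULL) return arena_malloc(sz, pUserData);
    arena_header_t *h = (arena_header_t *)p - 1;
    if (sz <= h->size) return p;  /* shrinking: keep the existing block */
    void *q = arena_malloc(sz, pUserData);
    if (q != NULL) memcpy(q, p, h->size);
    return q;
}

static void arena_free(void *p, void *pUserData)
{
    (void)p; (void)pUserData;  /* no-op: memory comes back only via arena_reset() */
}

static void arena_reset(void) { arena_used = 0; }

/* Wiring into dr_flac (not compiled here):
 *   drflac_allocation_callbacks cb;
 *   cb.pUserData = NULL;
 *   cb.onMalloc  = arena_malloc;
 *   cb.onRealloc = arena_realloc;
 *   cb.onFree    = arena_free;
 */
```

Calling `arena_reset()` between tracks, after the decoder has been closed, would reclaim the whole buffer in one step. A bump allocator is only one possible design; a free-list or fixed-size pool would reuse memory better if the decoder reallocates often.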

VincentGijsen commented 3 years ago

Hi David,

No worries, I'm already happy that you took the time to take a peek at my code snippets.

I can imagine that my choice of naming made you question the logic of this line:

myData.bytes_per_sample = FLAC_SAMPLES_MS_44100_16BIT_BYTES_PER_SAMPLE;

The intent is to calculate the number of bytes per USB transaction. Such an isochronous transaction occurs every 1 ms and is expressed in bytes. A 48 kHz audio rate implies 48000 16-bit samples x 2 channels of data per second, so every 1 ms that is 48 samples x 2 channels, each 16 bits.

As a USB packet sends uint8, that turns into 96 int16 values times 2 bytes. With that reasoning you can see that at 44.1 kHz we have a different number of bytes per ms in the transmission.

That is also what you spotted in the putBuffer function: a raw conversion of each 16-bit value into two uint8 bytes.
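For reference, a raw split of each 16-bit sample into two bytes, as opposed to a conversion to 8-bit audio, might look like the following. This is only a sketch and assumes the USB side expects 16-bit little-endian PCM; the function name `pack_s16_le` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy int16 samples into a byte buffer, low byte first (little-endian).
   Returns the number of bytes written, which is sample_count * 2. */
static size_t pack_s16_le(const int16_t *src, size_t sample_count, uint8_t *dst)
{
    for (size_t i = 0; i < sample_count; i++) {
        uint16_t v = (uint16_t)src[i];             /* same bits, no scaling */
        dst[2 * i]     = (uint8_t)(v & 0xFFu);     /* low byte */
        dst[2 * i + 1] = (uint8_t)(v >> 8);        /* high byte */
    }
    return sample_count * 2u;
}
```

The sample values are unchanged here; only the storage is viewed as bytes, so the bit depth stays at 16.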

The USB Audio Class also interleaves PCM samples, so that should implicitly remain unchanged, I would think.

So far I would think things are as they should be, at least to my understanding of dr_flac and the USB specs. There isn't a minimum number of PCM frames I must request, I assume?

As you can see, I typically request 88 or 96 samples (== 1 ms of audio) and put them into a buffer.
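Since 44.1 kHz does not divide evenly into 1 ms packets, one possible way to decide how many frames each packet should carry is a remainder accumulator. The sketch below is an assumption about how this could be done, not something taken from dr_flac or the USB specification, and the `frame_pacer_*` names are hypothetical. It counts in frames, so the per-packet sample count would be the result times the channel count.

```c
#include <stdint.h>

/* Hypothetical pacer: spreads sample_rate_hz frames evenly over 1000 packets/s. */
typedef struct {
    uint32_t sample_rate_hz;
    uint32_t remainder;  /* leftover, in units of 1/1000 of a frame */
} frame_pacer_t;

static void frame_pacer_init(frame_pacer_t *p, uint32_t sample_rate_hz)
{
    p->sample_rate_hz = sample_rate_hz;
    p->remainder = 0;
}

/* Returns the number of frames to put into the next 1 ms packet. */
static uint32_t frame_pacer_next(frame_pacer_t *p)
{
    uint32_t total = p->sample_rate_hz + p->remainder;
    p->remainder = total % 1000u;  /* carry the fraction to the next packet */
    return total / 1000u;
}
```

At 44100 Hz this yields nine packets of 44 frames followed by one of 45, repeating, which adds up to exactly 44100 frames per second. At 48000 Hz it always returns 48.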

PS: sorry for the crappy typing, I'm on my phone.

And again thanks for looking into this already.

I cannot explain why the audio sounds like it's played back at 2 to 3 times speed, and why the range/amplitude seems compressed. Perhaps there is still an issue with the int16 to 2x uint8 conversion, but that doesn't explain the fast playback phenomenon to me.

I'll try to upload the full code later this week; I was going to do that anyway, should you have some time to kill and nothing better to do. However, I fully understand if you have plenty of other things to be bothered about.


mackron commented 3 years ago

You need to determine the sample format, channel count and sample rate that your hardware is expecting. If you don't know, you need to find out. That's the first thing you need to do because if the hardware specs are different to the FLAC file, nothing will work properly.

If the hardware is expecting something other than signed 16-bit samples you'll need to do a data conversion. If your hardware is using a different channel count, you'll need to do channel conversion. If your hardware is using a different sample rate, you'll need to do resampling.

So first step is to determine your hardware's sample format, channel count and sample rate and report back here. The same for your FLAC file. Once you have that information it'll be easier to figure out what's going on.

VincentGijsen commented 3 years ago

Hi David,

Thanks for your addition. So far it works quite well with WAV files [signed 16-bit PCM @ 48 kHz], if I may say so myself. Once I've made some more progress on metadata/navigation, I'm sure to make another attempt at FLAC, and probably your MP3 library as well.

Should you be bored: https://github.com/VincentGijsen/CarIAPPlayer is what I'm working on. It's pretty messy though ;)

mackron commented 3 years ago

If dr_wav is working for you, then dr_flac should work exactly the same. Just call drflac_read_pcm_frames_s16() in exactly the same way you're calling drwav_read_pcm_frames_s16().