ValleyBell / libvgm

A more modular rewrite of most components from VGMPlay. will include sub-libraries for audio output, sound emulation and VGM playback
irc://irc.digibase.ca/#vgmrips
136 stars 33 forks source link

PlayerA: 24bps can cause misalignment errors #108

Closed jprjr closed 10 months ago

jprjr commented 1 year ago

Hi there!

I compiled libvgm with undefined behavior sanitizers and learned the current implementation for rendering 24bps audio can cause alignment errors. Here's an except from running with vgm2wav set to 24bps:

libvgm/player/playera.cpp:45:27: runtime error: member access within misaligned address 0x555597374ed3 for type 'struct int24_s', which requires 4 byte alignment
0x555597374ed3: note: pointer points here
 00  3a e0 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00

It would require updating a few things, but it may be worth having PlayerA render 24-bit samples as 24-in-32 (basically, the buffer pointer would be an int32_t *). As far as I can tell, all the audio backends in the vgm-audio library should support that. It's also a good deal easier to deal with as a library user - I can just access it as an int32_t and re-pack as needed.

Sidenote: if you set the ALSA backend to 24-bit, it sets the audio format to SND_PCM_FORMAT_S24 which is the mentioned "24-bits in a 32-bit type" format. But setting the Pulse backend to 24 bits uses the PA_SAMPLE_S24NE format which is the packed. Meaning if you modified player.cpp to use 24bps, it would work with Pulse but produce garbage with ALSA. So there's a mis-match there.

Plus - I'm not sure how else to pack audio as a 24-bit machine-native type without resorting to detecting the endianness at runtime. But packing as a 24-in-32 is just clamping the value to the 32-bit range and copying.

Happy to help code the change to adopt 24bps as the 24-in-32 format, both in the PlayerA class as well as the audio backends. Let me know!

jprjr commented 1 year ago

Small update, starting looking into this, it might make more sense to standardized 24bps audio to the 3-byte format across the board. It seems like the underlying audio libraries should also handle that. Plus since the PlayerA class has to choose a packing function at runtime anyway, adding an endianness check isn't a large impact.

Going with the 24-in-32 approach would require adding a lot of exceptions (ie, checking if the block alignment is divisible by 3 and padding it out to 4 in the ALSA audio driver).