Closed jprjr closed 10 months ago
Sorry for the late reply.
Thanks for catching the issues with the 24-bit stuff!
There are a few things I'd like to have changed though:
SampleConv_toS24
and do Little Endian/Big Endian via a #define
? You do that in the ALSA driver anyway.SampleConv_toS24
(LE/BE), you create an array UINT8 tmp[3]
and copy that into buffer
later. Can you just assign UINT8* tmp = buffer
and write directly to that?ReadLE16
and ReadLE32
in the DRO/GYM/S98/VGM code as well.TBH I'm a bit split about the memcpy stuff.
I see on godbolt that GCC/Clang generally handle these in a very performant way with any optimization level.
However, MSVC 2015..2019 requires at least /O2
to generate decent assembly code.
It calls the external memcpy
function with other optimization levels, causing a heavy performance drop in Debug mode. But one could argue that MSVC is know for that anyway.
I tested this against MS VC 6.0 (which I still use for 32-bit VGMPlay), and it behaves like later MSVC versions, so I guess this should be fine though.
MS Visual C++ 6.0 test:
memcpy-test.c
#include <string.h>
extern int arr[];
void myfn() { int a = 2; int b = 3;
arr[3] = a;
memcpy(arr, &b, sizeof(int));
}
- compile using `cl /O2 /Famemcpy-test.asm /c memcpy-test.c`
- `memcpy-test.asm`
```asm
TITLE memcpy-test.c
.386P
; [omitted segment definitions]
PUBLIC _myfn
EXTRN _arr:BYTE
; COMDAT _myfn
_TEXT SEGMENT
_myfn PROC NEAR ; COMDAT
; File memcpy-test.c
; Line 10
mov DWORD PTR _arr+12, 2
; Line 11
mov DWORD PTR _arr, 3
; Line 12
ret 0
_myfn ENDP
_TEXT ENDS
END
Regarding keeping it a single function and conditionally-compiling code - with the ALSA driver I have the benefit of knowing it will be compiled on Linux - alsa/asoundlib.h
includes endian.h
-- which is a platform-specific header. It's not defined by POSIX, and even between Unix-ish platforms it differs in what macros it defines.
I can try getting together a list of macros to determine endianness and do the same in PlayerA. Main concern though is that would bring platform-specific macros into code that (as far as I can tell) is platform-independent. That's why I went with the two-function + runtime check.
I wouldn't mind to just have a global #define
that controls Endianess.
Apparently you can do this easily with CMake's built-in scripts:
If you want, I can do the CMake integration and leave the code fixes to you.
That would be great!
I pushed a commit 5708a21390ed955059efff8dd501644c435dd5e2 that makes CMake define the VGM_LITTLE_ENDIAN
and VGM_BIG_ENDIAN
variables.
I also implemented the ALSA fixes in 02bc8a64bfec599793c5d85bc9b5149e41dabe8f and gave you credit.
Do you intend to touch the pointer-cast/memory stuff again at some point?
Sorry about that - pushed up a new version that uses the defines
Thanks.
Out of curiosity - what is the reason you avoid the packed INT32 data : 24;
so much?
Most properties of bitfields are implementation-defined so I generally avoid them, like most undefined behavior/properties. The only time I ever use them is when I need to sign extend from a fixed number of bits.
Okay, that makes sense.
Fixes #108 , also obsoletes #109
Replacing the direct assigned-to-pointer-casts with memcpy calls avoids potential undefined behavior (like if the sample buffer wasn't allocated on the correct byte boundary). On systems where the alignment is a non-issue, compilers should replace memcpy calls with the equivalent of
*(INT16*)buffer = (INT16)value;
etc.