ValleyBell / libvgm

A more modular rewrite of most components from VGMPlay. will include sub-libraries for audio output, sound emulation and VGM playback
irc://irc.digibase.ca/#vgmrips
137 stars 33 forks source link

Stack overflow when loading large VGM file? #125

Closed mmontag closed 2 weeks ago

mmontag commented 3 weeks ago

Is it possible that libvgm puts a large amount of data on the stack at some point during file load, or do I have some other issue? I am getting stack overflow errors in my Emscripten build when I load large VGM files (e.g., 8.86 MB underwater rampart md.vgm).

I'm loading files like so (source):

UINT8 lvgm_load_data(lvgm_player *player, const UINT8 *data, const UINT32 size) {
  PlayerA *playerA = real(player);
  UINT8 retVal;

  DATA_LOADER *dLoad = MemoryLoader_Init(data, size);
  DataLoader_Load(dLoad);
  playerA->LoadFile(dLoad);
  ...
}

Here is the STACK_OVERFLOW_CHECK output from my debug build with a 5MB stack size:

Aborted(stack overflow (Attempt to set SP to 0x001fd3d0, with stack limits [0x005da940 - 0x00ada940]). If you require more stack space build with -sSTACK_SIZE=<bytes>)

kode54 commented 3 weeks ago

I ran into this sort of error in Cog with different libraries. Apparently it's necessary to really pay attention to the use of local arrays for literally anything more than a few bytes, as larage arrays will fill up the stack pretty quickly.

ValleyBell commented 3 weeks ago

I don't see any large local arrays there, so I have no idea where the large stack usage could come from. All arrays are either module-global or part of a class. And objects are always created on the heap, unless I missed something.

The largest structures that land on the stack should be CHIP_DEVICE and SN76496_CFG. And even then there is usually not more than one of these on the stack.

kode54 commented 3 weeks ago

No locally created classes?

ValleyBell commented 3 weeks ago

The only locally created classes I can find are:


Of course I'm sort of poking around in the dark here. A stack trace could help figuring out the exact spot where the crash happens and possibly give a hint to all the things that are stored there.

kode54 commented 3 weeks ago

I wish you luck at finding it.

mmontag commented 2 weeks ago

Sorry, false alarm. For future reference:

The pattern I was using to invoke C functions from JavaScript was effectively passing all VGM file bytes on the stack:

const err = Module.ccall(
  'lvgm_load_data', 'number',
  ['number', 'array', 'number'],
  [vgmCtxPtr, data, data.byteLength]
);

The correct way to pass large data to Emscripten compiled code is to copy it to the heap first. Specifying 'array' for the 2nd argument type is wrong. It should be a pointer ('number'):

const dataPtr = Module._malloc(data.byteLength);
Module.HEAPU8.set(new UInt8Array(data), dataPtr);
const err = Module.ccall(
  'lvgm_load_data', 'number',
  ['number', 'number', 'number'],
  [vgmCtxPtr, dataPtr, data.byteLength]
);

See https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html#access-memory-from-javascript, https://github.com/emscripten-core/emscripten/issues/6860#issuecomment-405818401.