korginc / volcasample

volca sample SDK - a sample and sequence encoding library for volca sample.
http://korginc.github.io/volcasample/index.html
BSD 3-Clause "New" or "Revised" License

Compiling SYRO to WebAssembly (DataType_Sample_Compress corruption) #18

Open benwiley4000 opened 2 years ago

benwiley4000 commented 2 years ago

I should clarify: I did compile the library to WebAssembly, using Emscripten with cwrap to consume the library from JavaScript in a web browser. It works!

However, if I convert a sample using compression (DataType_Sample_Compress), the resulting WAV file is corrupted: it differs from the correct output I get from processing the same sample file on Linux with a GCC-compiled version of the library. I tried Emscripten with asm.js output (instead of WASM), but that produced the same corrupted WAV file. The conversion itself reports no error, but transferring the file to the volca sample triggers an error message on the device. The transfer starts normally, then fails once it reaches the compression blocks.

If I use DataType_Sample_Liner, the GCC and Emscripten results are identical, so I think something about the korg_syro_comp algorithm does not translate correctly to WASM. I'm not a C expert, but I'm guessing the algorithm may rely on some undefined behavior that GCC and Emscripten handle differently. Or perhaps the program requests a temporary buffer larger than Emscripten wants to provide? I'm not really sure what to look for.

I did some visual analysis of the two compressed data WAV results in Audacity. I will also attach the files below.

The audio file I'm converting is the 02 Kick 3.wav file included in the Syro example code.

At a high level the result files look the same. They are the same length and have the same data organization structure:

Screenshot from 2021-07-23 15-22-07

When you zoom in, you can see that the buffered sine/triangle wave sections are identical, and the initial (short) data block is identical across the two files:

Screenshot from 2021-07-23 15-24-55

Moving to the compression blocks part of the file, you can see that at a high level the blocks look similar. Each block starts and stops at the same time across the two files:

Screenshot from 2021-07-23 15-26-57

However, zooming in shows that the data inside corresponding blocks is NOT the same across the two files.

The first 0.0025 seconds of each compressed section is identical:

Screenshot from 2021-07-23 15-31-14

We can see that all the data after this is completely different:

Screenshot from 2021-07-23 15-32-17

You can analyze these files yourselves here:

syro_test_result_files.zip
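
For anyone who would rather locate the divergence programmatically than by eye, here is a minimal sketch in C. It assumes both WAV files have already been read into memory as byte buffers of equal length; the function name `first_mismatch` is hypothetical and is not part of the Syro library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Syro SDK.
   Returns the index of the first byte at which a and b differ,
   or len if the first len bytes are identical. */
size_t first_mismatch(const unsigned char *a, const unsigned char *b, size_t len)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            return i;
        }
    }
    return len;
}
```

Calling this on the GCC and Emscripten outputs gives a byte offset, which can be converted to a time position using the byte rate stored in the WAV header and compared against what Audacity shows.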

benwiley4000 commented 2 years ago

Turns out this error was due to an out-of-bounds stack buffer access that wasn't caught by the compiler. It's not guaranteed to cause issues, but with Emscripten it seems to cause issues consistently. PR #19 addresses the problem.
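
To illustrate the general class of bug (this is not the actual code from korg_syro_comp or from PR #19), here is a minimal sketch in C. Reading past the end of a local array is undefined behavior: it returns whatever happens to sit next to the array, which depends on the compiler and its stack layout, so two builds can silently disagree. The function name `sum_abs_diff` and its parameters are hypothetical; the point is only that the loop bound is clamped to the number of samples that really exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not taken from the Syro SDK.
   Sums the absolute differences between consecutive samples.
   `wanted` is how many samples the caller would like to scan;
   `available` is how many the buffer actually holds. The loop
   uses the smaller of the two, so it never reads out of bounds. */
uint32_t sum_abs_diff(const int16_t *samples, size_t available, size_t wanted)
{
    assert(samples != NULL);
    size_t n = wanted < available ? wanted : available;
    uint32_t sum = 0;
    for (size_t i = 1; i < n; i++) {
        /* Widen to 32 bits so the subtraction cannot overflow. */
        int32_t d = (int32_t)samples[i] - (int32_t)samples[i - 1];
        sum += (uint32_t)(d < 0 ? -d : d);
    }
    return sum;
}
```

Without the clamp, a call that asks for more samples than the buffer holds would compile cleanly and might appear to work on one platform while corrupting output on another. Building with GCC or Clang and `-fsanitize=address` is one way to catch this kind of stack overread at run time.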