spessasus / SpessaSynth

MIDI SoundFont/DLS synthesizer library written in JavaScript.
https://spessasus.github.io/SpessaSynth/

[FEATURE REQUEST] Configurable audio outputBuffer size #34

Closed OpenSourceAnarchist closed 1 month ago

OpenSourceAnarchist commented 1 month ago

Is your feature request related to a problem? Please describe. For computationally expensive functions that run every tick, every noteOn, etc. (like the one I'm implementing), it would be really nice to have a configurable size for the "audio out" stereo buffers rather than the whole system running in real time. This doesn't matter so much for offline audio rendering, but the library internally processes the voices with a single Float32Array each for the left and right audio buffers. For intensive calculations or functions that take a while to return during real-time playback, changing outputLeft/outputRight to Float32Array[] would allow for smooth playback; by default the behavior would stay as it is now, with a length of 1 and a single Float32Array.

Is your feature related to the sound library or the app/website? Library

Describe the solution you'd like In voice_control.js, renderVoice() would instead take in outputLeft {Float32Array[]} and outputRight {Float32Array[]}. panVoice() in stereo_panner.js would also need to be updated to handle this array of buffers. Any other outputBuffer use would have to be changed depending on whether you're looking at a single audio sample buffer or an array of them.

The developer could set the buffer size in seconds and it would adjust based on the playback frequency, e.g. if you want a 6-second audio buffer it would be bufferSize = sampleRate * seconds.
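A rough sketch of the sizing (the names are illustrative, not the current API):

```js
// Hypothetical sizing for the proposed multi-buffer output.
const sampleRate = 44100; // playback frequency in Hz
const seconds = 6;        // buffer length chosen by the developer
const bufferSize = sampleRate * seconds; // 264,600 samples per channel

// Current behavior: a single Float32Array per channel.
// Proposed: an array of them, with length 1 as the default.
const outputLeft = [new Float32Array(bufferSize)];  // Float32Array[]
const outputRight = [new Float32Array(bufferSize)]; // Float32Array[]
```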

Describe alternatives you've considered None. A popular program I use, GXSCC, lets the user configure the audio buffer in seconds because low-end hardware struggles to keep up in real time. A configurable audio buffer, set by a developer/user, would let intensive computations or low-end hardware benefit (e.g. in the Demo this could be paired with high-performance mode, where high performance --> 1-second audio buffer, for example). In GXSCC it is configurable from 0.2 seconds to 6 seconds. When I load a MIDI file, it takes 6 seconds before the song starts playing, but this ensures no popping or lagging during playback!

Additional context The same could be true for the video buffer, but I have no idea how you're handling that or whether it would be possible to sync with the duration of the audio buffer. Essentially this just delays the start of playback to give the buffer time to fill and, hopefully, never empty...

spessasus commented 1 month ago

SpessaSynth is heavily integrated with the Web Audio API, which only allows real-time rendering or an OfflineAudioContext. If that doesn't suit your needs, then I'm sorry, but there's no way to directly render audio out to a buffer.

The code you've described is executed in the Audio Worklet thread, which is inaccessible for the user.

Though, spessasynth_core might be of use to you. It provides a direct "render" method for writing the data into a Float32Array.
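Roughly like this (a sketch from memory; verify the class and method names against the spessasynth_core docs):

```js
// Sketch only: the names and signatures below are assumptions,
// check the spessasynth_core documentation before relying on them.
import { SpessaSynthProcessor } from "spessasynth_core";

const sampleRate = 44100;
const synth = new SpessaSynthProcessor(sampleRate);

// Render one block directly into your own Float32Arrays,
// at whatever pace your algorithm can manage:
const left = new Float32Array(1024);
const right = new Float32Array(1024);
synth.renderAudio([left, right], [], []); // dry output; effects omitted
```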

spessasus commented 1 month ago

Though the OfflineAudioContext should work fine. It ensures no dropped frames of audio, no matter how weak the device is (I've managed to render a song on a super old netbook with 1 GB of RAM and a single-core Atom).
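For instance, with the plain Web Audio API (nothing SpessaSynth-specific here):

```js
// Standard Web Audio API: render 6 seconds of stereo audio offline,
// as fast as the machine allows, with no dropped frames.
const sampleRate = 44100;
const ctx = new OfflineAudioContext(2, sampleRate * 6, sampleRate);

// ...build the node graph on ctx as usual, then:
ctx.startRendering().then((rendered) => {
    const left = rendered.getChannelData(0);  // Float32Array
    const right = rendered.getChannelData(1); // Float32Array
});
```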

Edit: after reading your suggestion a few times, I still don't get it.

renderVoice renders the voice to the left speaker and the right speaker; that makes two arrays. Why would you want N copies of the same array?

OpenSourceAnarchist commented 1 month ago

Technically what I'm asking for could still be implemented, albeit not the way I was thinking. I didn't know the Web Audio API was either real-time or completely offline (I'm not usually a web dev, haha; this is my first JS project). I also didn't know you had a separate npm version of the library, since this version is also compatible with npm. The render() method in spessasynth_core is exactly what I was looking for!

If you wanted to add configurable buffer support in this library, you could just keep a temporary Float32Array[] for the outputBuffer (or outputLeft/outputRight, as I said above), and only after it is initially full would the player indicate it's ready and begin popping Float32Arrays to renderVoice() as usual. The Web Audio API would still receive real-time data one buffer at a time, while internally a configurable buffer merely feeds the data to renderVoice() normally.
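Something like this (all names made up, just to show the shape of the idea):

```js
// Sketch: a pre-filled queue of chunk pairs that the real-time side
// drains one pair per callback. Nothing here is existing API.
class PrebufferedOutput {
    constructor(sampleRate, seconds, chunkSize) {
        this.targetChunks = Math.ceil((sampleRate * seconds) / chunkSize);
        this.queue = []; // array of [Float32Array, Float32Array] pairs
        this.ready = false;
    }

    // Called by the (possibly slow) renderer whenever a chunk pair is done.
    push(left, right) {
        this.queue.push([left, right]);
        if (this.queue.length >= this.targetChunks) this.ready = true;
    }

    // Called by the real-time side once per callback; null signals an underrun.
    pop() {
        return this.queue.shift() ?? null;
    }
}
```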

An alternative, though: for MIDI files that are too difficult for my tuning algorithm (too many simultaneous voices to solve for), I could just render them "offline", save the MTS SysEx calls in the file, then play them back in real time without my algorithm running. Otherwise I'll probably have to fork the library or just wait for spessasynth_core to get MTS support.

Or... it gets ported to WebAssembly and I write the algorithm in C++ and I don't have to worry about performance (not really, but it would make it less of a concern) 😄

EDIT: The purpose is not to have N copies of the same array, but to store the next ~6 seconds of audio in a buffer for example. One buffer for each sample, where there are 44.1k samples per second.

spessasus commented 1 month ago

> The purpose is not to have N copies of the same array, but to store the next ~6 seconds of audio in a buffer for example. One buffer for each sample, where there are 44.1k samples per second.

The buffer is always 128 samples long
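That's the Web Audio render quantum; every AudioWorklet callback gets a fixed 128-frame block:

```js
// Standard AudioWorklet: the engine calls process() with fixed-size
// 128-frame blocks; there is no way to request a bigger one.
class QuantumDemoProcessor extends AudioWorkletProcessor {
    process(inputs, outputs, parameters) {
        const left = outputs[0][0];  // Float32Array, 128 frames
        const right = outputs[0][1]; // Float32Array, 128 frames
        // fill left/right here
        return true; // keep the node alive
    }
}
registerProcessor("quantum-demo", QuantumDemoProcessor);
```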

spessasus commented 1 month ago

> I also didn't know you had a separate npm version of the library, since this version is also compatible with npm.

SpessaSynth is made for browsers, while core is made for use with Node.js (npm is just a package manager, like apt or pacman).

spessasus commented 1 month ago

> An alternative, though: for MIDI files that are too difficult for my tuning algorithm (too many simultaneous voices to solve for), I could just render them "offline", save the MTS SysEx calls in the file, then play them back in real time without my algorithm running. Otherwise I'll probably have to fork the library or just wait for spessasynth_core to get MTS support.

How about just going through the MIDI file, manually computing note times and adding the SysEx calls? It can take as much time as it needs, and then you simply send the modified MIDI to the sequencer.

Take a look at this code: it goes through the MIDI file and detects all the note-ons used. You could edit it and make it insert the needed SysExes instead?
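The shape of it, roughly (the event field names below are assumptions; adapt them to whatever the parser actually exposes):

```js
// Sketch: walk a parsed MIDI file offline and insert tuning SysExes
// before each note-on. The field names (tracks, messageStatusByte,
// messageData) are assumptions, not a documented API.
function insertTuningSysEx(parsedMidi, computeSysExFor) {
    for (const track of parsedMidi.tracks) {
        for (let i = 0; i < track.length; i++) {
            const event = track[i];
            const isNoteOn = (event.messageStatusByte & 0xf0) === 0x90
                && event.messageData[1] > 0; // velocity 0 means note-off
            if (!isNoteOn) continue;
            // This can take as long as it needs: nothing is real-time here.
            const sysEx = computeSysExFor(event);
            track.splice(i, 0, sysEx); // insert just before the note-on
            i++;                       // skip past the inserted event
        }
    }
    return parsedMidi;
}
```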

spessasus commented 1 month ago

Since I can't implement the original feature here, I'm closing this as not planned.

OpenSourceAnarchist commented 1 month ago

Okay, then bufferSize = sampleRate / 128 * seconds. It shouldn't matter in the end. I'm glad Mozilla is aware of the issue and is aiming to make it configurable. Ideally I would just set it to 264,600 and it would accomplish the same thing as what I'm saying, but since that isn't possible, a temporary buffer should enable the same use case.
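For concreteness (my numbers):

```js
// 6 seconds at 44.1 kHz, in 128-sample blocks:
const blocks = Math.ceil((44100 * 6) / 128); // 2068 blocks
const samples = 44100 * 6;                   // 264,600 samples per channel
```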

I know npm is just a package manager; it's just that I saw npm support in this repo's wiki, not only in the _core repo. Which makes sense, since npm packages can ship browser code. I'd like my library to work both in browsers and via npm, so I'm still hoping to use this library instead of _core.

> Take a look at this code: it goes through the MIDI file and detects all the note-ons used. You could edit it and make it insert the needed SysExes instead?

Yes, that's actually what I was originally going to do before I found your library. While it would still be easier to use your MIDI parser and modify it to also calculate the MTS SysEx messages, I'm really trying to copy another software application and implement it in your Demo app code, since that one was also real-time. Though of course it also let you configure the audio buffer...

Since you marked it as not planned, I'll just have to fork your library and add in a temporary buffer. It really doesn't seem difficult and would solve the problem completely. No worries! Maybe once you see what I mean you'll accept a PR :D

spessasus commented 1 month ago

Sure, feel free to fork it. Though I don't think it will be that easy, since, again, you can't output a list of Float32Arrays via the process() method (or change their size). But if you accomplish it, feel free to file a PR!

OpenSourceAnarchist commented 1 month ago

Well... I ended up finally implementing the dynamic tuning algorithm, porting it from C++ Eigen to ml-matrix in JS. To keep up with ~25 Hz at a minimum (arbitrary), if my algorithm is called every tick, I have 40 ms to perform the entire calculation. I did some benchmarking, and my implementation can reliably handle ~280 notes at most within 40 ms. Rather than hack your library even further and work against the Web Audio API, I'm just going to set a reasonable voice_cap when my algorithm is in use and call it a day. I took a look at process() again and I see what you mean; it really wouldn't be that simple, and I'd have to change many of the other functions, just like in this lib version. However, my algorithm should now be easily portable to spessasynth_core too, since I'm avoiding a buffer :)
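For reference, the shape of the budget check I benchmarked with (my own code, nothing from the library):

```js
// 25 Hz minimum update rate => 1000 / 25 = 40 ms per solve.
const BUDGET_MS = 40;

function solveWithinBudget(solve, notes) {
    const t0 = performance.now();
    const tuning = solve(notes); // the ml-matrix based solver
    const elapsed = performance.now() - t0;
    // Benchmarks showed ~280 notes reliably fit the budget; past that,
    // a lower voice cap is the simple fix rather than stalling the tick.
    if (elapsed > BUDGET_MS) {
        console.warn(`solve took ${elapsed.toFixed(1)} ms for ${notes.length} notes`);
    }
    return tuning;
}
```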

If you're open to it, I'd love to include the algorithm as a toggle in the Demo app. All it would do is register my algorithm as a function called on every tick (or on every note on/off, I haven't decided yet), and it would send MTS SysEx messages appropriately. The only reason I could see you not wanting it is that index.html would need to include `<script src="https://cdn.jsdelivr.net/npm/ml-matrix@6.11.1/matrix.umd.js"></script>`, or at least load it dynamically if a user enables the algorithm. I'd be happy to do all the work and eventually port it to ss_core once MTS support is added.

The whole reason users would want this is to get more consonant pitches when playing a MIDI file or playing an instrument live with your synthesizer. See http://www.hermode.com/index_en.html and https://support.apple.com/guide/logicpro/hermode-tuning-lgcpa88a63e7/mac for a simpler dynamic tuning algorithm; the reasoning is the same. Logic, Cubase, Capella, and many others implement this. The algorithm is taken directly from https://arxiv.org/abs/1706.04338 and would work in both offline and real-time contexts.
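For reference, this is the kind of message the algorithm would emit: a sketch built from the MIDI Tuning Standard's real-time single-note tuning change (sub-IDs 08 02); the helper itself is mine, not library API:

```js
// Build an MTS real-time single-note tuning change:
// F0 7F <device> 08 02 <tuning program> <count> [kk xx yy zz]* F7
function mtsSingleNoteTuning(key, targetFreqHz, program = 0, device = 0x7f) {
    const midiNote = 69 + 12 * Math.log2(targetFreqHz / 440);
    const xx = Math.floor(midiNote); // nearest equal-tempered note below
    const frac = Math.min(16383, Math.round((midiNote - xx) * 16384)); // 14-bit fraction
    return new Uint8Array([
        0xf0, 0x7f, device, 0x08, 0x02, program,
        1,                                         // one note follows
        key, xx, (frac >> 7) & 0x7f, frac & 0x7f,
        0xf7,
    ]);
}
```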

spessasus commented 1 month ago

First of all: congrats!

Second of all: since you've managed to code it in, this means that MTS works. Please close #29 then, like I asked you to.

Lastly: if you've coded it, can you at least provide a basic demo? You know, just create a repo and host it on GitHub Pages so I can see it.

And while I don't think I'll add this for two reasons:

OpenSourceAnarchist commented 1 month ago

I implemented a simulation of it with random parsed MIDI data in JS. I haven't integrated it with your library at all yet (hence why I said "or every note on/off, haven't decided yet"... I only have an abstract implementation to confirm JS would work at all). If I had, I would have closed the issue and shown you a demo :)

But that's completely fine; a shout-out would be more than enough. Your library makes it easy to write a cross-platform custom MIDI player. I'll just include spessasynth_lib and write a customized frontend based on your demo app, maybe a Node.js version eventually too. I'll reply to the MTS issue when it's done so you can see it (and I'll also close it if I don't find any problems!!)

Oh, one thing I did want to ask about: can you change offline_audio.js to optionally enable the event system on the created Synthetizer? I think this used to be an option, and I'll need it to let my algorithm work "offline" and to easily create new MIDI files with the computed MTS SysEx messages baked in. This way, real-time performances and creating new MIDI/audio files can use the same code :)

spessasus commented 1 month ago

Yes you can (you always could), but keep in mind that the offline context renders audio as fast as possible, which might be too fast for the main thread. But feel free to try :-)

OpenSourceAnarchist commented 1 month ago

I'm sorry, I didn't see that it was in the examples folder; I thought it was part of the main library, I don't know why. Yes, thanks haha. And thank you for letting me know that! Wow, I could easily see that being a weird bug driving me crazy, wondering why it sometimes renders OK and other times I'd get a weird audio file. I'll stick to an AudioContext(). I've learned so much about web dev, JS, linear algebra, and the MIDI spec these past few days!!