EmuSC as a library - Githubissues

dwhinham commented 2 years ago

Hi there! 🙂

First off, great idea for a project! I'm in regular contact with Kitrinx (who, as you know, worked on the ROM reverse-engineering with NRS), and we have discussed using this knowledge to try and create a new SC-55 synth. Ideas such as extending Munt or hacking unofficial extensions into the SoundFont spec for FluidSynth were bounced around, but we have been too distracted with other things to get started, so it was great to come across this repo.

I've been working on a project called mt32-pi which is a baremetal Raspberry Pi-based MT-32/SoundFont synth based on Munt and FluidSynth. FluidSynth is obviously great at what it does, but the SC-55 has more advanced envelopes and other features that aren't possible to do within the SoundFont spec, so it is still very desirable to get a better SC-55 emulation into the project.

I appreciate it's very early days for EmuSC, but I've already had a look at the code to see what it would take to get it ported to baremetal.

I'd love to see the core synth engine decoupled from any front-end, application, or file IO stuff so that it can be integrated into other applications more easily - in other words, it'd be great to be able to compile a "libemusc.a" and just have a bunch of headers that give me an API to initialize the synth, set some config options, hand it some MIDI data, and pull out some audio samples. It's already pretty close to being this way, but I'd have to replace the Config class with something that doesn't throw exceptions or write default config files to disk. Being able to handle ROM loading manually would be useful, e.g. initialize Control/PCM ROM objects with my own file paths (or even byte arrays that I've filled myself), and then pass those to the Synth object.

It'd also be great to see everything namespaced under EmuSC:: to help keep external code separate from integrators' code.

Munt's API design is quite friendly for integrating it as a library, so maybe it could serve as some inspiration for EmuSC's design?

Anyway - just some ideas from the perspective of someone who'd be interested in integrating this work as it matures - again, I appreciate it's early days and very much work-in-progress. 🙂

Cheers, and keep up the great work!

skjelten commented 2 years ago

Hi Dale,

I'm happy to hear that you found this project of interest! It was actually during this Christmas holiday when I was working on my own custom mt32-pi project that I started to search for a free software SC-55 emulator. I ended up reading the entire vogons thread started by mattw and got inspired by all the effort they had put into decoding the ROM files.

Since your mt32-pi project was the reason I started coding EmuSC in the first place, I am of course very interested in making the necessary changes for it to be a part of that project. Looking at the munt source I can see how they have placed the core synth in a library - and added their own namespace. Seems like a reasonable approach, likewise with the changes you mention on file IO and the Config class. So thumbs up to all your suggestions.

That said, this project is, as you mentioned, still in an early development phase, so I would like to first improve the core synth to a level where it produces sounds that do not make your head hurt. At that point I will be much more confident that this project will succeed in its goal and that you will not be wasting time on the integration.

PS: I will of course never refuse patches or requests for project membership if you would like to do any coding yourself 🙂

mmontag commented 2 years ago

Great to see this project! I've been waiting for something like this. Roland should do it themselves, honestly. But here we are.

> I'd love to see the core synth engine decoupled from any front-end, application, or file IO stuff so that it can be integrated into other applications more easily

I will second this, and also emphasize that audio output (and MIDI sequencer code, although this is debatable) should be outside of the library. I like the libADLMIDI interface, for example: https://github.com/Wohlstand/libADLMIDI/blob/master/include/adlmidi.h#L1072

ThatRetroGuy commented 2 years ago

If you know somebody who can help this project get to a similar result, that would be a big help. He's mostly the person focusing on it at the moment.

skjelten commented 2 years ago

Please have a look at the latest commit. EmuSC is now split into a library and a frontend application.

I was a bit uncertain about how to design the library interface. After a lot of back and forth I looked at Munt to see what they did, only to find out that they implement 4(!) different APIs. So in the end I made 3 classes the public API: ControlRom, PcmRom and Synth.

The basic flow is like this:

1. Create a ControlRom object
2. Create a PcmRom object
3. Create a Synth object
4. Set up a callback function that runs Synth->get_next_sample() when the audio buffer is running low
5. Send raw MIDI events to Synth->midi_input()

Please have a look and tell me if the interface is OK for integration with e.g. mt32-pi (and yes, it is my intention to write some documentation later 🙂)
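For readers following along, the flow above might look roughly like this from the integrator's side. This is only a sketch: the three class names and the method names midi_input() and get_next_sample() come from the description above, but the constructors, argument types, return types, the fill_audio_buffer() helper and all stub bodies are assumptions made up for illustration, not EmuSC's actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the three public classes. Only the class names
// and midi_input()/get_next_sample() come from the thread; every signature
// and body here is an assumption.
namespace sketch {

struct ControlRom {
    explicit ControlRom(std::vector<std::uint8_t> bytes) : data(std::move(bytes)) {}
    std::vector<std::uint8_t> data;
};

struct PcmRom {
    explicit PcmRom(std::vector<std::uint8_t> bytes) : data(std::move(bytes)) {}
    std::vector<std::uint8_t> data;
};

class Synth {
public:
    Synth(const ControlRom &ctrl, const PcmRom &pcm) : _ctrl(ctrl), _pcm(pcm) {}

    bool roms_loaded() const { return !_ctrl.data.empty() && !_pcm.data.empty(); }

    // Accept one raw 3-byte MIDI event (stub: only counts note-on events).
    void midi_input(std::uint8_t status, std::uint8_t data1, std::uint8_t data2)
    {
        (void)data1;
        if ((status & 0xF0) == 0x90 && data2 > 0)
            ++_activeNotes;
    }

    // Produce one mono sample (stub: silence unless a note is active).
    std::int16_t get_next_sample() { return _activeNotes > 0 ? 1000 : 0; }

private:
    const ControlRom &_ctrl;
    const PcmRom &_pcm;
    int _activeNotes = 0;
};

// Audio callback the host would run when its buffer is running low.
inline void fill_audio_buffer(Synth &synth, std::int16_t *out, std::size_t frames)
{
    for (std::size_t i = 0; i < frames; ++i)
        out[i] = synth.get_next_sample();
}

} // namespace sketch
```

The point of the shape is that the host owns audio output and MIDI input; the library only consumes MIDI bytes and hands back samples on request.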

dwhinham commented 1 year ago

Hi there, sorry for the late reply - I have been so busy lately.

Thank you for doing this separation work, it does indeed make it much easier for integration into mt32-pi.

I've already got some test code working as per your instructions and EmuSC is making sound from bare metal, although it quickly suffers from performance issues when more than one or two notes are playing. But it's a great start!

Some comments:

- I managed to convince the Automake build system to do what I needed (cross-compile a static library with the ARM toolchain) but I would love to see CMake used instead; it can help take a lot of pain out of maintaining (and distributing) a cross-platform project, even on e.g. Windows. If you're committed to sticking with Automake, then no big deal, otherwise I'd be happy to contribute a CMake build system if you like.
- std::mutex isn't available in my bare metal environment, so I have to comment out the relevant parts of synth.h/synth.cc (the #include statement, the midiMutex class member, calls to midiMutex.lock()/midiMutex.unlock() etc.). Instead, I use spinlocks to synchronise MIDI/audio threads from the call site (the same is done for Munt/FluidSynth). This isn't a big deal, but being able to #ifdef out any threading/mutex-related stuff via the build system would save me the hassle of having to patch EmuSC.
- The use of std::vector (and other STL containers that use heap memory allocation) in many places is causing issues because the heap allocator in the bare metal environment is very limited and doesn't cope well with thousands of small allocations. I am able to somewhat work around this with a custom memory allocator (the same technique was required for FluidSynth), but it's difficult and I'm still fighting allocator-related performance issues and memory fragmentation. As a general suggestion I would urge you to try and allocate fixed-size buffers up-front, and avoid resizable containers at all costs, especially in the "hot" areas of code that are involved in audio rendering, as it can cause performance to tank, especially on systems with limited CPU power. I will try to provide specific examples at some point; I am still trying to work out where the main issues are as it's difficult to debug bare metal, but I even had memory-related issues with the ROM loading/parsing code.
- It is probably worth trying to make use of lookup tables for some good performance gains. For example, when profiling I found thousands of calls to pow() via NotePartial::_convert_volume(). Seeing as the input is a uint8_t, there can only be 256 possible output values, and so with C++14 you could do something like:
```
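#include <cmath>
#include <cstdint>

double NotePartial::_convert_volume(uint8_t volume)
{
  struct VolumeLookupTable
  {
      // Note: std::pow is not constexpr in standard C++14; this relies on
      // a compiler extension (e.g. GCC evaluates it at compile time).
      constexpr VolumeLookupTable() : data()
      {
          // int counter: a uint8_t counter could never reach 256
          for (int i = 0; i <= UINT8_MAX; ++i)
              data[i] = (0.1 * std::pow(2.0, (double)(i) / 36.7111) - 0.1);
      }

      constexpr double operator[](uint8_t i) const { return data[i]; }

      // One entry per possible uint8_t value (0..255)
      double data[UINT8_MAX + 1];
  };

  constexpr auto volume_lut = VolumeLookupTable();
  return volume_lut[volume];
}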
```
This generates a compile-time lookup table. Again this requires C++14; I think this can be made shorter with C++17, although if you wanted to stick to C++11 it'd be simple enough to just generate an array with a script and hard-code the values.
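If compile-time evaluation of std::pow turns out not to be available on a given toolchain, one fallback that still works with C++11 is to fill the table once at run time. The following is a minimal sketch of that different technique (one-time initialisation on first use, not a compile-time table); the free function name convert_volume_lut is made up for illustration, and only the formula is taken from the snippet above.

```cpp
#include <cmath>
#include <cstdint>

// Same formula as above, evaluated once per table entry on the first call.
// convert_volume_lut is a hypothetical name, not part of EmuSC.
inline double convert_volume_lut(std::uint8_t volume)
{
    struct Table {
        double data[256];
        Table()
        {
            for (int i = 0; i < 256; ++i)
                data[i] = 0.1 * std::pow(2.0, static_cast<double>(i) / 36.7111) - 0.1;
        }
    };
    // Function-local static: constructed once, the first time we get here.
    static const Table table;
    return table.data[volume];
}
```

After the first call, every lookup is a plain array read, so the cost of pow() is paid 256 times in total instead of once per call.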

I think this issue can probably be closed as you have achieved the library separation, and if you're interested in discussing any of the feedback above we can make some separate issues if you like.

Cheers!

skjelten commented 1 year ago

Great to hear that you got it to work!

Some feedback to your comments:

- I was considering moving to CMake when I started the separation work, but decided not to due to the high number of other changes going on at the time. I plan on moving to CMake later (partly because I want to learn how it works 🙂) and it would be great if you could do a code review before committing.
- Feel free to send a patch for the mutex part. Ifdefs are not the most elegant code style, but I have used them quite extensively for MIDI and audio configuration and it is of course not a problem to add them for threading as well.
- When programming C++ I tend to always use std:: containers, such as std::vector, simply because that is "the intended way in C++". That being said, I do not think it would be much effort to move to static arrays in most places. If you find the most problematic instances we can convert them to static arrays first. I will keep it in mind for future code though.
- I actually implemented support for lookup tables in the ControlRom class last week (lookup tables located in the control ROM, that is). Kitrinx found them earlier this year and they probably cover both TVA volume (as you noted above) and TVA duration, in addition to many other areas we have not identified yet. But in the meantime, before we manage to use these ROM tables, your suggestion about generating them absolutely makes sense. I went for C++11 only because it was the oldest version that supported std::thread, and we may very well aim at a newer version.
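The move from resizable containers to fixed-size storage discussed above could look something like the sketch below. Everything in it is hypothetical: VoiceSketch and FixedPool are illustrative names, and the capacity would have to come from whatever limit the synth actually has.

```cpp
#include <array>
#include <cstddef>

// Hypothetical voice record; the field is a placeholder, not EmuSC's layout.
struct VoiceSketch {
    int key = 0;
};

// Fixed-capacity container: storage is part of the object, so pushing and
// removing elements never touches the heap.
template <typename T, std::size_t Capacity>
class FixedPool {
public:
    // Returns false instead of growing when the pool is full.
    bool push(const T &value)
    {
        if (_size == Capacity)
            return false;
        _items[_size++] = value;
        return true;
    }

    // Remove by moving the last element into the freed slot (order not kept).
    void remove_at(std::size_t index)
    {
        if (index < _size)
            _items[index] = _items[--_size];
    }

    std::size_t size() const { return _size; }
    const T &operator[](std::size_t i) const { return _items[i]; }

private:
    std::array<T, Capacity> _items{};
    std::size_t _size = 0;
};
```

The trade-off is that the caller has to handle a full pool explicitly, which in a synth usually means dropping or stealing a voice rather than allocating more memory.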

I will close this issue as you suggested, and let us create new issues for each topic going forward.

dwhinham commented 1 year ago

> I was considering to move to CMake when I started the separation work ... it would be great if you could do a code review before committing.

Absolutely, no problem at all. I'm no CMake expert but I've used it successfully on several small to medium-sized projects (e.g. https://github.com/milkytracker/MilkyTracker, which targets Windows, Linux and macOS).

> Feel free to send a patch for the mutex part. Ifdef's are not the most elegant code style, but I have used them quite extensively for midi and audio configuration and it is of course not a problem to add for threading as well.

Will have a think about this - I agree on the #ifdefs. Some C libraries define high level macros that implement initialization/lock/unlock, which might be nicer and allow more flexible overriding. For example, this is how FluidSynth does it.
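As a rough sketch of the macro idea: the library would only ever refer to three macros, and an integrator could predefine them before including the headers. The macro names (EMUSC_MUTEX_TYPE, EMUSC_MUTEX_LOCK, EMUSC_MUTEX_UNLOCK) and the toy MidiQueueSketch class are invented here for illustration and exist neither in EmuSC nor in FluidSynth.

```cpp
#include <cstdint>

// Hypothetical macro names. If the integrator has not supplied their own
// definitions, fall back to std::mutex.
#ifndef EMUSC_MUTEX_TYPE
#include <mutex>
#define EMUSC_MUTEX_TYPE      std::mutex
#define EMUSC_MUTEX_LOCK(m)   (m).lock()
#define EMUSC_MUTEX_UNLOCK(m) (m).unlock()
#endif

// Toy class standing in for the synth's MIDI input path; not EmuSC's Synth.
class MidiQueueSketch {
public:
    // Store one MIDI byte; bytes beyond the fixed capacity are dropped.
    void push(std::uint8_t byte)
    {
        EMUSC_MUTEX_LOCK(_mutex);
        if (_count < kCapacity)
            _buffer[_count++] = byte;
        EMUSC_MUTEX_UNLOCK(_mutex);
    }

    int size()
    {
        EMUSC_MUTEX_LOCK(_mutex);
        const int n = _count;
        EMUSC_MUTEX_UNLOCK(_mutex);
        return n;
    }

private:
    static constexpr int kCapacity = 64;
    EMUSC_MUTEX_TYPE _mutex;
    std::uint8_t _buffer[kCapacity] = {};
    int _count = 0;
};
```

A bare metal build could then define EMUSC_MUTEX_TYPE as its own spinlock type and map the two other macros onto that type's acquire and release calls, without patching the library sources.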

> When programming C++ I tend to always use std::containers, such as std::vector, simply because that is "the intended way in C++". That being said, I do not think it would be much effort to move to static arrays in most places. If you find the most problematic instances we can convert them to static arrays first. I will keep it in mind for future code though.

The use of the std:: containers isn't too much of a problem in itself and certainly makes the code simpler, but it does make it extremely difficult (or even impossible) to override new/delete for individual libraries when custom memory allocation is required, and this is an example of why e.g. the games industry often avoids them.

Still, things can be improved dramatically with a few calls to std::vector::reserve() - I was able to reduce the number of allocations due to vector resizing from around 13,000 to only 3,000 by reserving vector sizes up-front in PcmRom (will prepare some PRs soon). This eliminates a lot of memory fragmentation issues in my use case. There are also some places where std::array would make more sense when the data is constant and the size is known.
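To make the effect of reserve() concrete, here is a small self-contained sketch that counts allocations with and without it. CountingAllocator and count_allocations are illustrative helpers written for this example; they are not code from EmuSC or mt32-pi, and the figures above came from the real PcmRom code, not from this toy.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Minimal allocator that counts how many times allocate() is called.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U> &) {}

    T *allocate(std::size_t n)
    {
        ++allocation_count();
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T *p, std::size_t n) { std::allocator<T>().deallocate(p, n); }

    static std::size_t &allocation_count()
    {
        static std::size_t count = 0;
        return count;
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T> &, const CountingAllocator<U> &) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T> &, const CountingAllocator<U> &) { return false; }

// Fill a vector with n samples and return how many allocations that took.
inline std::size_t count_allocations(std::size_t n, bool reserve_first)
{
    std::size_t &count = CountingAllocator<std::int16_t>::allocation_count();
    count = 0;
    std::vector<std::int16_t, CountingAllocator<std::int16_t>> samples;
    if (reserve_first)
        samples.reserve(n);  // one allocation up-front
    for (std::size_t i = 0; i < n; ++i)
        samples.push_back(static_cast<std::int16_t>(i));
    return count;
}
```

With reserve_first set, the vector allocates once; without it, the vector reallocates every time it outgrows its capacity, and each reallocation is another chance to fragment a small heap.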

It may be worth thinking about the approach (maybe when the ROM data is fully understood): the way Munt handles ROM data is to maintain a list of hashes, offsets and table sizes for compatible ROMs and their useful data. That way parsing could be avoided, allocations could be done upfront, and the raw ROM data could just be cast into struct pointers. I'm aware there's some decryption that you have to perform for SC-55 though, so that may complicate things.
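A minimal sketch of that table-driven idea follows. Every name, offset and size in it is made up for illustration, FNV-1a is used only as a simple stand-in for whatever checksum a real implementation would pick, and none of this reflects Munt's actual data structures or any real SC-55 ROM layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Describes where one table lives inside one known ROM image (hypothetical).
struct RomLayout {
    std::uint32_t hash;          // identifies a known ROM image
    std::size_t   table_offset;  // where a fixed-size table starts
    std::size_t   entry_count;   // number of records in that table
};

// Example fixed-size record as it might sit in the ROM (hypothetical layout).
struct RawRecord {
    std::uint8_t bytes[4];
};

// 32-bit FNV-1a hash over the whole image.
inline std::uint32_t fnv1a(const std::uint8_t *data, std::size_t size)
{
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < size; ++i) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

// Return the layout whose hash matches the ROM image, or nullptr if unknown.
inline const RomLayout *find_layout(const RomLayout *known, std::size_t known_count,
                                    const std::uint8_t *rom, std::size_t rom_size)
{
    const std::uint32_t h = fnv1a(rom, rom_size);
    for (std::size_t i = 0; i < known_count; ++i)
        if (known[i].hash == h)
            return &known[i];
    return nullptr;
}

// Copy record `index` out of the ROM; returns false if it is out of range.
inline bool read_record(const RomLayout &layout, const std::uint8_t *rom,
                        std::size_t rom_size, std::size_t index, RawRecord &out)
{
    const std::size_t offset = layout.table_offset + index * sizeof(RawRecord);
    if (index >= layout.entry_count || offset + sizeof(RawRecord) > rom_size)
        return false;
    std::memcpy(&out, rom + offset, sizeof(RawRecord));
    return true;
}
```

The sketch copies each record with memcpy rather than casting the ROM pointer, which sidesteps alignment concerns; a direct cast would avoid the copy where the target and record layout allow it. Any decryption step would have to run before the lookup.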

> I actually implemented support for lookup tables in the ControlRom class last week (lookup tables located in the control ROM that is). Kitrinx found them earlier this year and they probably cover both TVA volume (as you noted above) and TVA duration in addition to many other areas we have not identified yet.

That's great - nice work! I will update to the latest code; I was evaluating a commit before you added the lookup tables.

Thanks again!

skjelten commented 1 year ago

Hi there, a final update: I have just committed a change to CMake. Feel free to test and propose any improvements!

skjelten commented 2 weeks ago

@dwhinham Another "final" update on this topic: I just tested the latest git of emusc, and it is able to play up to ~4 simultaneous notes before reaching the CPU limit on a Raspberry Pi 3 B+ (including low-pass filtering, reverb effects etc.). This was, however, done as a normal Qt application in a regular Linux install, and not as a native standalone application like you do in the mt32-pi project.

I don't know if you are still actively working on the mt32-pi project. The C++ API of emusc might be a real challenge to work with, as it has been pretty much shaped to fit the Qt application, and emusc is still not at a level where it sounds as good as the real hardware, or nuked-sc55 for that matter. Still, I am pretty sure we can make it run on Raspberry Pis if we want to. The model 4 could perhaps work pretty OK already.

Just give me a heads up if you ever want to try emusc again, and I'll put some extra effort in getting the CPU usage further down so that it (hopefully) will work on models >= 3.

skjelten / emusc

EmuSC as a library #1