gbevin opened 7 years ago
@NothanUmber @TheTechnobear hey guys, if you have any cycles to look at the current HEELP state on Windows, that would be very helpful. I've got too little Windows development experience to really know where to direct the search to. It would be great to have the core reliable on both macOS and Windows before starting to make things more complex.
@gbevin Will give it a try in the evening. I'm also not really an expert; the only lower-level Win32 work I did was years ago on WinCE, and that differs substantially in detail from the "big" Windows. For example, WinCE's shared memory approach was unique: you could even directly exchange pointers between processes (as long as they pointed into the SM), because the SM was mapped to the same reserved addresses in all processes :)
Thanks @NothanUmber ! :-)
Got rid of the insta-crash when switching audio devices: https://github.com/NothanUmber/HEELP/commit/50c2c2904be83ff61f7134a2f9e2570e810fdeed
Still at least two issues remaining:
1) When increasing the audio buffer size I still get an invalid memory access in ChildAudioComponent::getNextAudioBlock: the memcpy already uses the bigger buffer size, but the buffer itself is still too small. This can be prevented by initializing the local buffer with a big, fixed size (a temporary local hack), but the method shouldn't run at all before the buffer has been reinitialized, so it's better to find the root cause.
2) Still getting a deadlock in juce_win32_ASIO.cpp line 678 when stopping the ASIO device (when switching from ASIO4All to my USB audio card): callbackLock is still held by the asio4all64 thread, which is rollercoasting in the loop in MainAudioComponent::PImpl::getNextAudioBlock, line 227, waiting for the SM blocks to become visible, which never happens.
@NothanUmber thanks!!! I don't get an instacrash when switching audio devices, so it's great that you found this. I'm wondering, would it have the same effect if you merely do this in ChildAudioComponent::Pimpl::releaseResources():
audioSourcePlayer_.setSource(nullptr);
deviceManager_.removeAudioCallback(&audioSourcePlayer_);
This would remove the need for an additional lock and still leave the memory deallocation to the destructor?
@NothanUmber I committed some changes that might fix what you have in your commit but without any additional locks, does it fix the instacrash for you?
@NothanUmber I completely removed the local buffer in ChildAudioComponent, which should solve that first issue you still had. No idea about that deadlock though, I'll try to find a proper Windows audio interface device to test this with, I can't get ASIO to work at all for my internal sound card.
@NothanUmber @TheTechnobear sadly I'm suspecting that ASIO doesn't actually allow multiple processes to use it, and that the first one that grabs it has exclusive access. This would mean that the multi-process approach that is driven off the audio callback thread can't work on Windows :-/
A quick search indicates exclusive mode can be disabled, but it introduces latency. I think your approach still works though; it just means that you get one buffer of latency under Windows, BUT you still gain the advantage of 'sandboxing', I'd have thought. It might be possible to structure the code such that channels can be either in-process (separate RT thread) or out-of-process, and let the user configure this. I think this is very useful, as for a 'small setup' one process with N RT threads may be preferable (better performance) if you don't want/need the sandboxing. Similarly, it would be useful if a channel could be executed in the 'main RT thread', i.e. with non-parallel processing. (This makes sense when you have very few cores.)
@TheTechnobear my searches indicate that only very few audio interfaces support ASIO in multi-client mode (RME), and that for others you would have to install a dedicated multi-client server from Steinberg, which doesn't seem to be actively maintained.
I would like to avoid an additional buffer of latency for the purposes of this application, since it's quite contrary to what you need for live performance. Note that there's already one buffer minimum for the audio generation, another one for the MIDI data collection, and this would be a third. So the multi-process approach was really only acceptable to me if just the busses would require that additional buffer and the dry signals could be output immediately.
I think HEELP will have to have a second mode with only RT threads and no sandboxing, which will be the only one available on Windows. As an upside, this might make it possible to not have an additional buffer of latency on the busses. Onwards with the multi RT threading approach then ... :-/
@TheTechnobear have you had a chance to look into the integration of the Eigenharp USB code?
@gbevin @TheTechnobear Yepp, the release date of that multi driver thing doesn't imply intense testing with Windows 10 :) ftp://ftp.steinberg.net/Download/Hardware/ASIO_multiclient_driver/
Geert, your change indeed resolves the "insta-crash". Now I am far enough to say that I also don't get audio with ASIO anymore. (Anymore: ASIO actually worked with the very first version of HEELP I tested)
Perhaps it is better anyway if a channel isn't forced to run in its own process on Windows (afaik, for older versions of Windows, process context switches were by factors more expensive than mere thread context switches; not sure whether Win10 got better in that regard...). So running 16 channels on 4 cores would require a lot of process switching (and thus cache flushing). Optimally we would only need as many threads as we have cores. (So several channels might share the same thread or not, depending on the hardware the setup is executed on.)
Indeed, I think in the code there should be a concept of channels and components (within channels), and these should be given a context to run from... this might be a 'remote process' context or a thread context... i.e. the channel doesn't know/care if it's communicating with the main process (container) via shared memory or real memory... this is an 'implementation detail' and is initialised based on configuration.
I've got guests until Friday, so I won't have time till then... after that I can bring in the Eigenharp code, though I was hoping there would be some elements of the component architecture in place to plug it into, so I can test it as I go along... (as above, I think it should just be a component that feeds into the MIDI stream; note: later we may want to add into the audio stream too). Also, I'm still a bit unsure about the 'jucer makefile': having this all as one big executable block seems very inflexible, and as soon as we want to start building .so's/DLLs etc., something like CMake seems more appropriate. Anyway, will see where we stand at the weekend, then take a fresh look (this should probably be in a new issue then ;))
Found a commercial multi-client capable ASIO driver: http://odeus-audio.com.au/Odeus/AsioLink Tested the demo version: also no sound. (My RME Babyface should also be multi-client capable though, so probably the driver isn't the (only) problem anyway; perhaps it would have to be initialized differently than Juce does it.) But anyway, something that most drivers don't support (or that you have to buy a >$50 extra program to be able to use) would probably limit the audience too much.
Also experimented with setting up the audio manager for the target device in the first place (instead of disabling the default and then switching). Subclassed AudioDeviceManager for that and overrode createAudioDeviceTypes to enable/disable different device types. Only got WASAPI in non-exclusive mode and DirectSound to run (which are afaik both not meant for low-latency audio applications). Unfortunately Jack Audio is not yet natively supported for Windows by Juce, and the ASIO JackRouter just crashes in libjack64.dll.
So no sandboxing it is :)
Tried to port the Linux JackAudio Juce bindings to Windows (essentially I just replaced dlopen with LoadLibrary and dlsym with GetProcAddress, plus a few ifdefs here and there). Status: crashes in libjack64.dll, like the ASIO JackRouter, when using the 64-bit build. With a 32-bit build it loads, but the list of output devices is reported as empty :/ After some investigation I found this statement: "Currently Jack for Windows only supports 32-bit applications, but a new version is currently being tested that supports both 32-bit and 64-bit audio applications. If you are working with 64-bit applications then contact the Jack Developers list for more information."
Tried to understand how Carla works with JackAudio/ASIO on Windows (it also uses Juce), but apparently Juce is only used for the UI there and they use RtAudio for the audio side... http://kxstudio.linuxaudio.org/Applications:Carla https://github.com/falkTX/Carla/ https://github.com/thestk/rtaudio
Yeah, I would really not want to rely on Jack at all, it has been very flaky over the years and many that have tried to rely on it have been burned and abandoned it.
I'm going to start working on the multi RT thread version tonight or tomorrow with a way to switch between either. It's a good thing that almost everything for the multi process stuff is already in dedicated classes.
The current core is very reliable on macOS, you can switch to different audio devices and configurations in a flexible manner and everything works well.
On Windows something is not right: the first startup seems stable and reliable, but switching to different audio devices and configurations afterwards doesn't hold up, even now that the whole audio infrastructure is recreated.