Open nyanpasu64 opened 2 years ago
Let me first say that the no-exceptions approach is new and I would not be surprised if it needs to be tweaked a bit. So, I'm happy to take suggestions or PRs that fix identified problems. I would also mention that I'm working on a big update regarding device enumeration (in the newdeviceselection branch), so that may change some behaviour too. But regarding what is described in this post, I don't immediately see what the problem is. It is true that the callback thread for the ALSA API is started near the end of the RtApiAlsa::probeDeviceOpen() and then conditionally waits until the stream is started, but that does not block other RtAudio functions from being called. If an error occurs when startStream() is called, the error() function (and if set, the error callback) will be invoked.
The issue I have is that there's no longer a clean way to call openStream() and receive error strings for only errors originating from the main thread before openStream() returns. Calling setErrorCallback() lets you supply a function which handles errors... but it gets called both from the main thread (for openStream() setup errors) and the audio thread (for audio loop errors).
Or is it important for graphical applications to handle both kinds of error, so you don't want to provide a convenient way to handle only setup errors?
EDIT: Possibly related to picking a device: https://github.com/thestk/rtaudio/issues/194
Perhaps I'm missing something but if openStream fails, it will return an RtAudioErrorType, so it would be clear there was a problem in that specific function (and if openStream fails, no audio thread would be created). I agree it might be an issue to get the corresponding error text for a specific problem, though it should be the one passed to the error callback with that particular error type. If this is the problematic issue for you, maybe an extra function such as getLastErrorMessage() could be added?
There are two high-level approaches to error handling: handling UI thread errors but ignoring audio thread errors (ideally they'd just stop the stream rather than terminating the app), and handling both UI thread errors and audio thread errors in the same or different ways. Currently this can be done by adding an error callback in both cases, checking whether the current thread differs from the thread that called openStream(), and either ignoring audio-thread errors or sending them to the UI thread over a queue or such.
There are two primary cases I feel aren't well-served right now: handling UI thread errors without a callback while ignoring audio thread errors, and handling UI thread errors without a callback while handling audio thread errors with a callback (you don't have much choice). If you add getLastErrorMessage(), then the first case can be handled by not adding a callback. But the second case requires adding a callback for audio thread errors, which also gets invoked for openStream() UI thread errors, so I have to either make the callback silently drop them (since they'll be handled when openStream() returns) or move my error handling logic into the callback (which works, but I was trying to avoid it in the first place).
One property I want the API to have is that when I add or remove audio thread error handling in an app, I don't have to change or move the UI thread error handling code. One way to do this is to separate audio thread error callbacks from UI thread error handling (and this is the API I want at the moment). For UI thread error handling (and synchronous code in general), I still prefer return codes and error strings over callbacks, since callbacks are indirect control flow, which is complicated to write and understand.
Also, can an unlucky user cause RtAudio to write to errorText_ and call error() on both the audio and main thread at the same time, causing a data race? If so, that's a good reason to entirely separate the audio thread error callback system from the GUI thread error status (or callback) system, and make them build up errors in different strings and call different functions to read from the strings.
(Why is errorText_ a field, rather than a stack-constructed std::string passed by (const?) reference into error()? Is the intent to reuse allocations across errors (though errors should happen so infrequently that the speed doesn't matter)? You could remove errorText_ altogether and use stack-allocated string literals or std::string instead. This resolves the data race on errorText_, but not on firstErrorOccurred_ and the other fields accessed by RtApi::error().)
If you split the audio and UI thread error functions, the tricky part is making sure you only call the audio thread error function on the audio thread, and the UI thread error function on the UI thread.
Personally I like to ensure this by creating a separate object for audio thread state (eg. the ALSA/Pulse handle, the current audio error string, plus a pointer to shared state), an object for UI thread state (not sure what, plus a pointer to shared state), and a struct for shared state (eg. an atomic variable holding playback state, variables to signal the audio thread to pause/resume/stop playback; mutexes and condvars probably go here too). The audio thread object is constructed on the UI thread, but sent to the audio thread and no longer referenced by the UI thread, and the two objects only communicate through the shared state struct. Then you put methods on the audio and UI thread objects, so each object's methods only have access to the state belonging to that thread plus shared state, and accesses to shared state are syntactically distinguished since it's behind a pointer.
This is the approach Rust encourages for threading, and it makes it clear which methods run on each thread and what state each thread accesses, by baking them into the type system and making them explicit in the source code (instead of relying on programmers understanding the code and never making mistakes, which rarely holds). However, this isn't standard C++ practice (I don't care; I prefer correct-by-design to standard practice), C++ isn't the best at mutexes (see my blog post), it's a lot of work to retrofit existing C++ code into this style (as a consolation, it's good at exposing data races to be fixed), and it requires intimate code knowledge I don't have yet (since I didn't design RtAudio).
creating a separate object for audio thread state (eg. the ALSA/Pulse handle, the current audio error string, plus a pointer to shared state), object for UI thread state (not sure what, plus a pointer to shared state), and struct for shared state (eg. an atomic variable holding playback state, variables to signal the audio thread to pause/resume/stop playback, probably put mutexes and condvars here too)
Will you accept a pull request to split RtApi into separate types accessed by the UI and audio threads? For example, the audio thread operates on an expanded CallbackInfo subclass, whose object only contains atomic and mutex-guarded data rather than an entire RtApi subclass? And the audio thread methods (like RtApiAlsa::callbackEvent()) would be moved to either static functions or CallbackInfo{Alsa,etc.} methods?
I looked into porting, and unfortunately it seems to require a good deal of effort, and may conflict with the in-progress changes at https://github.com/thestk/rtaudio/tree/newdeviceselection.
Also I only have a Windows and Linux machine to build and test on, so I can't port Core Audio at the moment, and I don't use OSS on Linux, nor ASIO on Windows (unless I perhaps install ASIO4ALL?). I don't know if this approach allows me to incrementally port code to the new "thread-safe by design through static typing" architecture.
I spent a day or two sorting through RtAudio's ALSA backend. I see a bunch of data races (errorCallback_, errorText_, RtApiStream::state). Additionally I think probeDeviceOpen() shouldn't spawn a thread; instead we should spawn a thread after opening a device (by calling a separate virtual method in RtApi). This makes it easier to prove that probeDeviceOpen() doesn't data-race when writing to RtApiStream.
Unfortunately RtAudio's threading is difficult to understand, with RtApiStream::state being accessed both with and without a mutex held, etc. The design is quite messy right now (RtApiStream, CallbackInfo, and AlsaHandle fields being read and written by the main and audio threads, with and without mutexes held), to the point that I don't know how to clean it up.
In comparison, cpal (a Rust library)'s multithreading (link) is beautiful. Every type is either owned by the main thread (Stream), owned by the audio thread (the call stack of output_stream_worker), or explicitly shared between the threads (StreamInner). However, to my knowledge cpal doesn't currently support duplex streams with synchronous input and output buffers. (I haven't written an app that records audio, but this would be a large issue if I did.) I would use cpal as a reference if I wanted to redesign RtAudio's multithreading.
Am I actually qualified to rewrite RtAudio multithreading? I'd have to change data structures, and I can't port or test Core Audio or OSS (and perhaps ASIO too).
I also want to add a mode where RtAudio tracks the default device as it changes. This is unrelated to threading, and I could work on that while leaving the data races in place. But I feel uncomfortable trying to add extra features to a broken library, so I don't know what to do...
I'm happy to have contributions to RtAudio. However, I would point out that RtAudio has been working fairly well for two decades. As messy as you may think it is, I don't want to break it. AND importantly, I want to keep things simple. So, you can submit PRs but I'm not interested in complete rewrites unless they make sense and hopefully reduce the number of lines of code.
Also, I'm not interested in having RtApi split. And I would look to the new branch (newdeviceselection) as a basis, as I will merge that as soon as I finish getting it to work for the Windows APIs. I would note that in implementing this new branch, I have generally been able to reduce the lines of code fairly significantly.
If you're not interested in splitting RtApi (I'm not exactly sure what you mean), I might fork RtAudio (largely preserving the API but not the architecture or ABI) and test my changes using BambooTracker. Though I'll wait until newdeviceselection is finished or merged before working more on this.
Using cpal as a reference, I estimate my proposed changes should maintain or decrease the lines of code.
I've finished the newdeviceselection port but it entails more API changes, so I am hoping to get feedback from others before merging it. Generally, I think it is a big improvement in terms of how devices are queried and enumerated.
I haven't done any serious work with device selection in exotracker. I might look into porting https://github.com/BambooTracker/BambooTracker to updated RtAudio, but I'm occupied elsewhere now.
If I have the time (unlikely), I still hope to fork RtAudio and rewrite a good chunk of the code from scratch. I've learned a lot about ALSA buffering (there are substantial changes I'd make to RtApiAlsa's backend code), but have yet to study PulseAudio or WASAPI and friends.
Sorry for not responding earlier. After months of rewrites, I finally have time to try upgrading RtAudio and seeing how it behaves. How does newdeviceselection compare to master right now (master was updated more recently)?
There may have been a few updates to master recently, but newdeviceselection is the future. I just need to find time to merge it into master.
Playing with the newdeviceselection branch. Here's my preliminary feedback:
getDeviceCount() is a footgun:
RtAudio::getDeviceInfo() takes a device ID from RtAudio::getDeviceIds() instead of an index in the range [0, getDeviceCount()). As a result, my old code (which passes ints in that range) compiles but fails at runtime. Is it worth renaming RtAudio::getDeviceInfo, removing RtAudio::getDeviceCount(), or changing the type signature to accept a struct instead of an unsigned int, so previous code will raise a compiler error?
RtAudio methods don't accept internal indexes into deviceList_, but instead device IDs, so making counting devices harder isn't a major problem.
getDeviceIds() and getDeviceNames() allocate a new vector holding a subset of the data in std::vector<RtAudio::DeviceInfo> RtApi::deviceList_, so removing getDeviceCount() increases the allocation overhead of that use case.
Redundant device probing:
Apparently RtApi::getDeviceCount() and RtApi::getDeviceIds() always call RtApiAlsa::probeDevices(), but RtApi::getDeviceInfo() only probes devices if deviceList_ is empty. I would've objected to getDeviceCount() and getDeviceIds() having different behavior.
getDeviceCount() followed by getDeviceNames() will probe devices twice, and may find a different set of results each time. Is this a footgun? Is there a better way for users to explicitly signal when to rescan or not?
You could drop getDeviceNames(), since (as indicated in probe.html) you can call getDeviceInfo() on each device ID... but this introduces O(n) linear scans with O(n^2) runtime... at this point you pray nobody has a freak computer with hundreds of devices.
I don't yet understand how RtApi::deviceList_ and deviceIds_ (apparently its type varies by RtApi subclass; do the two arrays always remain in sync?) and RtApiAlsa::probeDevices()'s local deviceIds and deviceNames interact. I'll have to read the source more carefully and take notes. (EDIT: I found the documentation at probe.html, but it covers API rather than implementation.)
RtApiAlsa::probeDevices() is called once internally by RtAudio::RtAudio() -> RtApi::getDeviceNames(), and again when my application code calls RtAudio::getDeviceIds() -> RtApi::getDeviceIds(). This probes devices twice, which doesn't need to happen.
I noticed devices appear and disappear in RtApiAlsa::probeDevices()'s output when I closed and restarted my program, but I suspect that isn't an RtAudio or ALSA race condition, but merely PipeWire/WirePlumber occupying hw devices when audio is playing and freeing them up during moments of silence (as "power saving").
Observations from reading the ID system more carefully (only RtApiAlsa so far):
RtAudio has two different types of IDs: public unsigned int RtApi IDs, and private backend-specific IDs (eg. strings for RtApiAlsa). I'd prefer giving both variables distinct names, and/or defining an AlsaDeviceId or RtApiAlsa::DeviceId struct or typedef around std::string.
RtApi::currentDeviceId_ is used to allocate public integer IDs. In theory this can overflow after plugging and unplugging 4 billion devices.
std::vector<RtAudio::DeviceInfo> RtAudio::deviceList_ holds all properties except backend-specific IDs. std::vector<std::string> RtApiAlsa::deviceIds_ holds ALSA-specific IDs. An invariant is that both vectors must remain in sync at all times, adding and removing matching items at the same indexes. I would document this in the source code.
Within RtApiAlsa::probeDevices(), std::vector<std::string> deviceIds; holds ALSA-specific IDs, and std::vector<std::string> deviceNames; holds user-facing names exposed through the backend-independent std::vector<DeviceInfo> RtAudio::deviceList_. Both vectors are parallel arrays, pushed to at the same time. (Unlike the RtApi[Alsa] vectors, these local vectors could be merged into a vector of structs.)
You explicitly check for default and pulse (but not pipewire), and otherwise only enumerate hw devices. Why not imitate aplay -L and give the user all the ALSA devices (including unusable ones), or maybe filter out the unusable ones?
When devices are repeatedly probed/enumerated, existing devices keep their public integer ID, new devices receive new IDs from RtApi::currentDeviceId_++ and are pushed to RtAudio::deviceList_ and RtApiAlsa::deviceIds_ simultaneously, and removed devices are removed from RtAudio::deviceList_ and RtApiAlsa::deviceIds_ simultaneously.
Devices are matched across probes by comparing RtAudio::deviceList_[].name and deviceNames[] (the user-facing name, for ALSA derived from sprintf( name, "%s (%s)", snd_ctl_card_info_get_name(ctlinfo), snd_pcm_info_get_id(pcminfo) )). I sure hope no two devices have the same name (in my testing, my different HW devices all had different names). I assume ALSA lets you open the same device (like default) in both input and output mode, and they're the same device in both cases.
I found at least four distinct sites in RtAudio and application usage which would cause $O(n^2)$ behavior in the number of devices present. Should this be removed, and does it cause noticeable slowdowns in practice?
1. The loop under // Fill or update the deviceList_ and also save a corresponding list of Ids. iterates over deviceNames, then over RtApi::deviceList_.
2. The loop under // Remove any devices left in the list that are no longer available. iterates over deviceList_, then deviceNames, and calls RtApi::deviceList_.erase() and RtApiAlsa::deviceIds_.erase() up to deviceList_.size() times.
3. Calling std::vector<unsigned int> RtAudio::getDeviceIds() followed by RtAudio::getDeviceInfo(unsigned int deviceId) on each ID returned: each of the N calls to RtApi::getDeviceInfo(unsigned int deviceId) performs a linear scan over RtAudio::deviceList_.
4. (from my previous comment) RtApiAlsa::probeDevices() is called once internally by RtAudio::RtAudio() -> RtApi::getDeviceNames(), and again when my application code calls RtAudio::getDeviceIds() -> RtApi::getDeviceIds(). This probes devices twice, which doesn't need to happen. Is there a better way for users to explicitly signal when to rescan or not?
Thanks for these observations and suggestions. I'm not sure when I'll have time to carefully consider them but I'm very happy to have the feedback. Currently, my approach was not to expect the user to specify a rescan but rather to do it automatically whenever a call is made that might need to involve a re-probing of devices (given hot-pluggable devices). If the APIs made it easier to know when new devices were added, things could be simplified but that is not currently the case.
The suggestion I'm most strongly for is to remove getDeviceCount(). That API is useless if you call it alongside getDeviceIds(), since the two function calls correspond to different device scans and may hold different data. And worse yet, it allows old previously-correct code to keep compiling and start doing the wrong thing (failing at runtime).
My other suggestions are less important. I have not observed two different devices with the same name so far (since RtApiAlsa's loop over cards and devices matches the final discovered list). And from what I hear, $O(n^2)$ in the number of devices isn't a major issue, since most users will only have up to a few dozen devices present at a time.
I looked into updating an existing RtAudio app to no-exceptions master, and ran into some trouble (https://github.com/BambooTracker/BambooTracker/pull/426#issuecomment-991486279). This is preliminary and I didn't try actually fixing the code, mostly skimming the RtAudio code and diffs to plan out what I'd have to change. Sorry if I'm misunderstanding something that's already documented.
Currently, when audio->openStream() fails, BambooTracker shows a dialog with contents taken from RtAudioError::getMessage(). After the merge, RtAudioError was removed, audio->openStream() only returns an enum code (no error message string), and RtApi::error() only reports the message by calling the std::function<...> RtApi::errorCallback_ callback. To my knowledge, you can't access the protected RtApi::errorText_ field without patching RtAudio to expose it through a getter or similar.
If you upgrade to no-exceptions and want to keep printing error messages, (I think) you would have to set a callback. The issue is that RtApi::error() (which calls your callback) gets called on both the UI and audio threads, and your callback needs to act differently based on which thread it's running on.
One idea is to edit AudioStreamRtAudio::initialize(), replace the start of the try block with audio->setErrorCallback(...);, replace the end of the block with audio->setErrorCallback({}); to reset it, and pray that the audio thread doesn't start and call RtApi::error() before we call setErrorCallback({}). I'm not sure that's the case.
If I understand correctly, RtApi::openStream() calls (eg.) RtApiAlsa::probeDeviceOpen(), which calls pthread_create (which, as far as I can tell, begins execution immediately rather than suspending the thread). In the case of ALSA, alsaCallbackHandler calls RtApiAlsa::callbackEvent(), which (in the case of STREAM_STOPPED) waits for AlsaHandle::runnable to become true (it's initially false). I think RtAudio defaults to STREAM_STOPPED. As a result, RtApiAlsa can't continue executing and call RtApi::error() before the stream is started.
I'm not sure if this will remain the case, and I didn't study every other backend to ensure they share this property (that if you call RtApi::setErrorCallback({}) when RtAudio::openStream() returns, then RtApi::error() cannot possibly be called on the newly spawned audio thread). I think RtApiPulse is the same way.
If (in theory) the audio thread starts up but immediately fails and calls RtApi::error(), there's an endless zoo of race conditions between the GUI thread calling RtApi::setErrorCallback({}) and resetting errorCallback_, the audio thread reading errorCallback_ to check whether to call it (this is a data race), and the audio thread calling errorCallback_, which tries to show a message box on the wrong thread (cue firework explosions).
One possible way out is to set a callback which checks the current thread ID and only opens a message box if running on the GUI thread.