ASIO Audio Driver Support In ESDR3 for Lowest Possible Latency

I have been thinking about how building direct ASIO audio driver support into ESDR3 could greatly reduce audio latency.

What Is ASIO? ASIO stands for Audio Stream Input/Output.

ASIO is a sound card driver protocol created by the German music company Steinberg. It was created out of necessity: when a musician records with a computer and there is too much latency, the delay in the audio means the drummer cannot keep time and the singer ends up out of time, so the result is basically a mess. What was needed was real-time audio with negligible latency. That’s where the ASIO driver comes in.

One of the biggest issues we have with computer-based SDR radio is latency. Latency can come from many areas in the system, such as video, CPU utilization and so on, but most of it is caused by the audio stack in an OS like Windows. None of this latency matters when you are listening to music or hearing a notification sound when an email arrives. But when you need the audio to be real-time, the old MME and WASAPI drivers just won’t work.

Since ESDR3 is an SDR application running on a personal computer, it makes sense to me to add ASIO driver support right into ESDR3, so users can add a low-latency ASIO audio interface to their system for the lowest possible latency and the highest audio fidelity. The benefit of very low latency is even more apparent with CW, SO2R, QSK and digital modes, where low latency and wide audio frequency response are critical.

The best alternative for applications that need low latency is to use the ASIO (Audio Stream Input/Output) model, which utilizes exclusive mode. After a user installs a third-party ASIO driver, applications can send data directly to the ASIO driver. However, the application has to be written so that it talks directly to the ASIO driver.
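As a rough illustration of why buffer size matters once an application talks straight to the driver: the delay contributed by one audio buffer is its length in frames divided by the sample rate. The sketch below is not ESDR3 or ASIO SDK code; the function name, the 48 kHz sample rate and the buffer sizes are illustrative assumptions only.

```python
def buffer_latency_ms(frames: int, sample_rate_hz: int) -> float:
    """Return the time, in milliseconds, that one buffer of `frames` samples represents."""
    if frames <= 0 or sample_rate_hz <= 0:
        raise ValueError("frames and sample_rate_hz must be positive")
    return 1000.0 * frames / sample_rate_hz


if __name__ == "__main__":
    # Hypothetical buffer sizes at an assumed 48 kHz sample rate.
    for frames in (64, 128, 256, 480):
        ms = buffer_latency_ms(frames, 48000)
        print(f"{frames:4d} frames @ 48000 Hz = {ms:.2f} ms")
```

At 48 kHz a 480-frame buffer works out to 10 ms, while a 64-frame buffer is a little over 1 ms, which is the kind of difference a low-latency driver path is meant to make possible.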

The image below shows why there is so much latency on, for example, a Windows PC. Look at all the steps the audio has to pass through in the Windows audio stack. With ASIO, audio is sent directly to the ASIO driver.

The following diagram shows a simplified version of the Windows audio stack:

[Image: low-latency-audio-stack-diagram-1]

Here is a summary of the latencies in the render path:

1. The application writes the data into a buffer.
2. The Audio Engine reads the data from the buffer and processes it. It also loads audio effects in the form of Audio Processing Objects (APOs). For more information about APOs, see Windows Audio Processing Objects.
   - The latency of the APOs varies based on the signal processing within the APOs.
   - Before Windows 10, the latency of the Audio Engine was equal to ~12 ms for applications that use floating point data and ~6 ms for applications that use integer data.
   - In Windows 10, the latency has been reduced to 1.3 ms for all applications.
3. The Audio Engine writes the processed data to a buffer.
   - Before Windows 10, this buffer was always set to ~10 ms.
   - Starting with Windows 10, the buffer size is defined by the audio driver (more details on this are described later in this topic).
4. The Audio driver reads the data from the buffer and writes them to the H/W.
5. The H/W also has the option to process the data again (in the form of additional audio effects).
6. The user hears audio from the speaker.
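To make the render-path numbers concrete, here is a minimal sketch that simply adds up the stages listed above. The function and constant names are hypothetical, the APO and hardware terms are left at zero because the text gives no figures for them, and the Windows 10 buffer value is only a placeholder since that size is defined by the driver.

```python
def render_latency_ms(engine_ms: float, buffer_ms: float,
                      apo_ms: float = 0.0, hardware_ms: float = 0.0) -> float:
    """Sum the latency contributions along the render path, in milliseconds."""
    return engine_ms + apo_ms + buffer_ms + hardware_ms


# Figures quoted in the render-path summary; APO and hardware terms are
# left at 0.0 because no values are given for them (assumption).
PRE_WIN10_FLOAT = render_latency_ms(engine_ms=12.0, buffer_ms=10.0)
PRE_WIN10_INT = render_latency_ms(engine_ms=6.0, buffer_ms=10.0)

# Windows 10: the buffer size is driver-defined, so 10.0 ms here is only a
# placeholder assumption that keeps the comparison like-for-like.
WIN10_SAME_BUFFER = render_latency_ms(engine_ms=1.3, buffer_ms=10.0)

if __name__ == "__main__":
    print(f"pre-Windows 10, float data:   {PRE_WIN10_FLOAT:.1f} ms")
    print(f"pre-Windows 10, integer data: {PRE_WIN10_INT:.1f} ms")
    print(f"Windows 10, assumed buffer:   {WIN10_SAME_BUFFER:.1f} ms")
```

Even with the engine time cut to 1.3 ms, the buffer between the engine and the driver dominates the total, which is why the driver-defined buffer size matters so much.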

Here is a summary of latency in the capture path:

1. Audio is captured from the microphone.
2. The H/W has the option to process the data (i.e. to add audio effects).
3. The driver reads the data from the H/W and writes the data into a buffer.
   - Before Windows 10, this buffer was always set to 10 ms.
   - Starting with Windows 10, the buffer size is defined by the audio driver (more details on this below).
4. The Audio Engine reads the data from the buffer and processes them. It also loads audio effects in the form of Audio Processing Objects (APOs).
   - The latency of the APOs varies based on the signal processing within the APOs.
   - Before Windows 10, the latency of the Audio Engine was equal to ~6 ms for applications that use floating point data and ~0 ms for applications that use integer data.
   - In Windows 10, the latency has been reduced to ~0 ms for all applications.
5. The application is signaled that data is available to be read, as soon as the audio engine finishes with its processing.

The audio stack also provides the option of Exclusive Mode. In that case, the data bypasses the Audio Engine and goes directly from the application to the buffer where the driver reads it from. However, if an application opens an endpoint in Exclusive Mode, then there is no other application that can use that endpoint to render or capture audio.
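A small sketch of what bypassing the Audio Engine means for the capture-path figures quoted above. The function name is hypothetical, the hardware term defaults to zero because no value is given, and treating the APO time as skipped in Exclusive Mode is my reading of the text (the APOs are loaded by the Audio Engine), not a measured result.

```python
def capture_latency_ms(buffer_ms: float, engine_ms: float,
                       apo_ms: float = 0.0, hardware_ms: float = 0.0,
                       exclusive: bool = False) -> float:
    """Sum the capture-path latency contributions, in milliseconds."""
    if exclusive:
        # Exclusive Mode: the data bypasses the Audio Engine, so its
        # processing time (and the APOs it loads, by assumption) drops out.
        engine_ms = 0.0
        apo_ms = 0.0
    return hardware_ms + buffer_ms + engine_ms + apo_ms


if __name__ == "__main__":
    # Pre-Windows 10 figures for floating point data: 10 ms buffer, ~6 ms engine.
    shared = capture_latency_ms(buffer_ms=10.0, engine_ms=6.0)
    exclusive = capture_latency_ms(buffer_ms=10.0, engine_ms=6.0, exclusive=True)
    print(f"shared mode:    {shared:.1f} ms")
    print(f"exclusive mode: {exclusive:.1f} ms")
```

The remaining term in Exclusive Mode is the driver buffer itself, and that is the part a low-latency driver can shrink.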

Thanks, KC2QMA
