anira-project / anira

an architecture for neural network inference in real-time audio applications
https://doi.org/10.1109/IS262782.2024.10704099
Apache License 2.0

anira is a high-performance library designed to enable easy, real-time-safe integration of neural network inference within audio applications. Compatible with multiple inference backends (LibTorch, ONNX Runtime, and TensorFlow Lite), anira bridges the gap between advanced neural network architectures and real-time audio processing. The paper gives more information about the architecture and design decisions of anira, as well as extensive performance evaluations made with the built-in benchmarking capabilities.

Features

Usage

An extensive anira usage guide can be found here.

The basic usage of anira is as follows:

#include <anira/anira.h>

// Create a model configuration struct for your neural network
anira::InferenceConfig myNNConfig(
    "path/to/your/model.onnx (or *.pt, *.tflite)", // Model path
    {2048, 1, 150}, // Input shape
    {2048, 1}, // Output shape
    42.66f // Maximum inference time in ms
);

// Create a pre- and post-processor instance
anira::PrePostProcessor myPrePostProcessor;

// Create an InferenceHandler instance
anira::InferenceHandler inferenceHandler(myPrePostProcessor, myNNConfig);

// Create a HostAudioConfig instance containing the host config infos
anira::HostAudioConfig audioConfig {
    1, // currently only mono is supported
    bufferSize,
    sampleRate
};

// Allocate memory for audio processing
inferenceHandler.prepare(audioConfig);

// Select the inference backend
inferenceHandler.selectInferenceBackend(anira::LIBTORCH);

// Optionally get the latency of the inference process in samples
int latencyInSamples = inferenceHandler.getLatency();

// Real-time safe audio processing in process callback of your application
void processBlock(float** audioData, int numSamples) {
    inferenceHandler.process(audioData, numSamples);
}
// audioData now contains the processed audio samples
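
The maximum inference time of 42.66 ms used above is close to the duration of 2048 samples at 48 kHz. The README does not state how the value was chosen, so treat that link as an assumption. The sketch below shows the arithmetic with two hypothetical helpers, samplesToMilliseconds and fitsInBlock, which are not part of anira:

```cpp
#include <cassert>

// Hypothetical helper (not part of anira): duration of `numSamples`
// samples at `sampleRate` Hz, expressed in milliseconds.
inline double samplesToMilliseconds(int numSamples, double sampleRate) {
    assert(numSamples >= 0 && sampleRate > 0.0);
    return 1000.0 * static_cast<double>(numSamples) / sampleRate;
}

// Hypothetical helper: true if an inference taking `inferenceMs`
// milliseconds finishes within the time spanned by `numSamples`
// samples at `sampleRate` Hz.
inline bool fitsInBlock(double inferenceMs, int numSamples, double sampleRate) {
    return inferenceMs <= samplesToMilliseconds(numSamples, sampleRate);
}
```

The same conversion can be applied to the value returned by getLatency(), for example samplesToMilliseconds(latencyInSamples, sampleRate), to report the latency in milliseconds instead of samples.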

Install

CMake

anira can be easily integrated into your CMake project. Either add anira as a submodule or download the pre-built binaries from the releases page.

Add as a git submodule

# Add anira repo as a submodule
git submodule add https://github.com/anira-project/anira.git modules/anira

In your CMakeLists.txt, add anira as a subdirectory and link your target to the anira library:

# Setup your project and target
project(your_project)
add_executable(your_target main.cpp ...)

# Add anira as a subdirectory
add_subdirectory(modules/anira)

# Link your target to the anira library
target_link_libraries(your_target anira::anira)

With pre-built binaries

Download the pre-built binaries for your operating system and architecture from the releases page.

# Setup your project and target
project(your_project)
add_executable(your_target main.cpp ...)

# Add the path to the anira library as cmake prefix path and find the package
list(APPEND CMAKE_PREFIX_PATH "path/to/anira")
find_package(anira REQUIRED)

# Link your target to the anira library
target_link_libraries(your_target anira::anira)

Build from source

You can also build anira from source using CMake. All dependencies are automatically installed during the build process.

git clone https://github.com/anira-project/anira
cd anira
cmake . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release --target anira

Build options

By default, all three inference engines are installed. You can disable specific backends as needed:

Thread synchronization can use either hard real-time safe raw atomic operations (the default) or semaphores. The semaphore option allows the use of wait_in_process_block in the InferenceConfig class. To enable the semaphore option, use the following flag:
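
As a rough illustration of the atomic style of synchronization (a sketch only, not anira's actual implementation), the audio callback can poll an atomic flag and carry on without a result when none is ready, so it never blocks. A semaphore-based variant would instead wait inside the callback for a bounded time. The class ResultSlot and its methods tryPublish and tryConsume are hypothetical names introduced for this example, which assumes exactly one producer thread and one consumer thread:

```cpp
#include <atomic>
#include <cassert>

// Illustrative single-slot handoff between one inference thread (producer)
// and one audio callback (consumer). Hypothetical; not taken from anira.
class ResultSlot {
public:
    // Producer side: refuses to overwrite a result that has not been
    // consumed yet. On success the value is written first and then
    // published with release semantics.
    bool tryPublish(float value) {
        if (ready_.load(std::memory_order_acquire))
            return false; // slot still holds an unconsumed result
        value_ = value;
        ready_.store(true, std::memory_order_release);
        return true;
    }

    // Consumer side: never blocks. Returns true and writes `out` if a
    // result was ready; otherwise returns false and leaves `out` untouched.
    bool tryConsume(float& out) {
        if (!ready_.load(std::memory_order_acquire))
            return false; // nothing ready, caller falls back (e.g. silence)
        out = value_;
        ready_.store(false, std::memory_order_release); // free the slot
        return true;
    }

private:
    float value_ = 0.0f;
    std::atomic<bool> ready_{false};
};
```

The polling version trades completeness for determinism: a late result is simply missed for that callback, whereas waiting on a semaphore can recover it at the cost of blocking the audio thread for up to the chosen timeout.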

Moreover, the following options are available:

Documentation

To use anira for inference with your custom models, check out the extensive usage guide. If you want to use anira for benchmarking, check out the benchmarking guide and the section below. Detailed documentation on anira's API will be available soon in our upcoming wiki.

Benchmark capabilities

anira allows users to benchmark and compare the inference performance of different neural network models, backends, and audio configurations. The benchmarking capabilities can be enabled during the build process by setting the -DANIRA_WITH_BENCHMARK=ON flag. The benchmarks are implemented using the Google Benchmark and Google Test libraries. Both libraries are automatically linked with the anira library in the build process when benchmarking is enabled. To provide a reproducible and easy-to-use benchmarking environment, anira provides a custom Google benchmark fixture anira::benchmark::ProcessBlockFixture that is used to define benchmarks. This fixture offers many useful functions for setting up and running benchmarks. For more information on how to use the benchmarking capabilities, check out the benchmarking guide.

Examples

Built-in examples

Other examples

Real-time safety

anira's real-time safety is checked in this repository with the rtsan sanitizer.

Citation

If you use anira in your research or project, please cite either the paper or the software itself:

@inproceedings{ackvaschulz2024anira,
    author={Ackva, Valentin and Schulz, Fares},
    booktitle={2024 IEEE 5th International Symposium on the Internet of Sounds (IS2)},
    title={ANIRA: An Architecture for Neural Network Inference in Real-Time Audio Applications}, 
    year={2024},
    volume={},
    number={},
    pages={1-10},
    publisher={IEEE},
    doi={10.1109/IS262782.2024.10704099}
}

@software{ackvaschulz2024anira,
    author = {Valentin Ackva and Fares Schulz},
    title = {anira: an architecture for neural network inference in real-time audio applications},
    url = {https://github.com/anira-project/anira},
    version = {x.x.x},
    year = {2024},
}

Contributors

License

This project is licensed under Apache-2.0.