RichardHladik / outotune

An opensource harmonizer implementation leveraging the DISTRHO Plugin Framework.
http://uralyx.cz/prog/outotune
66 stars 3 forks source link
harmonizer realtime-audio speech-analysis

Outotune

An opensource harmoniser implementation for LV2 and VST leveraging the DISTRHO Plugin Framework and the WORLD speech analysis, manipulation and synthesis system

What is it?

Outotune has been inspired by Jacob Collier's harmonizer, which can be seen in action here or here.

It allows you to become a one-man choir by analysing your voice and resynthesizing it at different pitches. It is controlled by MIDI, which basically means you can sing and accompany yourself by playing chords on your keyboard, and get a full choir of yous.

See Technical details for more detailed description and documentation.

Why the name?

Originally, Outotune was meant to be an opensource Autotune/Melodyne implementation, but it evolved into a harmoniser. Outotune comes from "Out o' tune", which is pretty self-explanatory.

Dependencies

For example on Debian, install libfftw3-dev libgl-dev libjack-dev.

Building on platforms other than Linux is possible, but not tested.

Building and running

First ensure that you have git submodules initialised by running

git submodule init
git submodule update

Also make sure you have all the needed dependencies installed (see above).

Then simply cd to the correct directory and run make:

cd plugins/outotune
make

By default, make will build outotune for all the supported plugin formats, this currently means LV2, VST2 and a stand-alone JACK executable.

Other useful commands are:

Usage

Outotune has 3 ports: mono input, MIDI input and mono output. The set of currently active notes is maintained by listening to MIDI Note On and Note Off events. In each period, Outotune analyses the the audio in the input buffer, synthesizes a voice for each active note, and muxes them into a single output. In the LV2 plugin, the instantaneous estimated fundamental frequency is available as an output parameter pitch.

Modes

Outotune has two modes which control how the MIDI events are interpreted:

GUI

GUI

The GUI consists of two parts: frequency graph and a top bar.

The frequency graph shows the estimated pitch as a function of time. The estimated pitch is shown in red, black areas represent the places where no pitch was detected.

The top bar contains three clickable toggles for controlling the application. Each of these toggles can also be triggered by a keyboard shortcut.

Known limitations

Currently, Outotune is computationally expensive. WORLD, the underlying speech analysis and synthesis system does not seem to be optimised for our workflow – in fact, real-time analysis is not officially supported (although real-time synthesis is). This may be improved by refactoring WORLD, mainly by reducing repetitive allocations and making better use of FFTW3 plans. Potential improvements can also be made by reducing the amount of data WORLD has to compute in each period.

Direct result of this is latency. For small enough buffer sizes, reducing the buffer size does not substantially reduce the work done in one period, which effectively means the smaller the buffer size, the more computationally expensive Outotune becomes. Latency is then directly determined by the smallest buffer size with which Outotune can still run smoothly on the machine.