WuYiming6526 / HARD

Harmony-Rhythm Disentanglement audio remixer plugin
MIT License

HARD

HARD-UI

HArmony-Rhythm Disentanglement audio remixer plugin.

This repository is a submission to the Neural Audio Plugin Competition (https://www.theaudioprogrammer.com/neural-audio).


How to install

  1. Download HARD-AUplugin.zip from here.
  2. Unzip the file and copy HARD.component to your AU plugin installation path. This is typically /Users/[Your Username]/Library/Audio/Plug-Ins/Components/ (current user only) or /Library/Audio/Plug-Ins/Components/ (all users).
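
The copy step above can also be scripted. The following is a minimal sketch, not part of this repository: the helper names `au_components_dir` and `install_component` are hypothetical, and it assumes the zip has already been unpacked so that HARD.component exists as a bundle directory on disk.

```python
import shutil
from pathlib import Path


def au_components_dir(system_wide: bool = False) -> Path:
    """Return one of the two AU installation paths named in step 2."""
    # Per-user path by default; the system-wide path usually needs admin rights.
    base = Path("/") if system_wide else Path.home()
    return base / "Library" / "Audio" / "Plug-Ins" / "Components"


def install_component(component: Path, dest_dir: Path) -> Path:
    """Copy an unzipped .component bundle (a directory) into dest_dir."""
    if not component.is_dir():
        raise FileNotFoundError(f"{component} is not a .component bundle")
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / component.name
    # Remove any older copy first so the DAW does not pick up a stale plugin.
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(component, target)
    return target
```

For example, if the bundle was unzipped into your Downloads folder (an assumed location), `install_component(Path.home() / "Downloads" / "HARD.component", au_components_dir())` copies it to the per-user path. Restart or rescan plugins in your DAW afterwards.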

How to use

Note: Make sure your DAW is running at a sample rate of 44.1 kHz.

  1. Create two audio tracks in your DAW and load a music audio clip into each track.
  2. Synchronize the two audio clips using the audio time-stretching feature of your DAW.
  3. Insert the HARD plugin on one of the audio tracks. If an "Apple cannot check app for malicious software" notification appears, manually allow the plugin in System Settings -> Security & Privacy, then restart the DAW.
  4. Send the output of the other audio track to the sidechain channel of the HARD plugin.
  5. Start playback and enjoy!

You can control audio generation by moving the sliders.


How to build

This repository contains the entire Xcode project.

  1. Clone the repository using the following command:
    git clone --recursive https://github.com/WuYiming6526/HARD.git
  2. Download the model file from https://www.dropbox.com/s/ndd7rrrljjccqfh/morpher.onnx?dl=0. Copy the downloaded file morpher.onnx to the root directory of the cloned repository.
  3. Open the Xcode project at Builds/MacOSX/HARD.xcodeproj.
  4. Build the project.

The built AU plugin is automatically copied to /Users/[Your Username]/Library/Audio/Plug-Ins/Components/ when the build finishes. If your DAW cannot find HARD, move the plugin to the other installation path, /Library/Audio/Plug-Ins/Components/.

How it works

HARD-VAE

I trained a VAE that encodes an audio spectrogram into disentangled latent features representing the harmonic (pitch-related) and rhythmic (pitch-invariant) content of the input audio. The decoder then generates an audio spectrogram from the latent features. By interpolating the latent features with the horizontal sliders, you can change the harmonic and rhythmic content of the generated audio independently.
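
To make the slider behaviour concrete, here is a minimal sketch of interpolating two disentangled latents independently. It is an illustration under stated assumptions, not the plugin's actual code: it assumes linear interpolation, assumes the two endpoints are the latents of the plugin track (A) and the sidechain track (B), and uses hypothetical names (`lerp`, `remix_latents`) and a made-up latent shape. The encoder and decoder are left out.

```python
import numpy as np


def lerp(z_a: np.ndarray, z_b: np.ndarray, alpha: float) -> np.ndarray:
    """Linear interpolation: alpha=0 returns z_a, alpha=1 returns z_b."""
    return (1.0 - alpha) * z_a + alpha * z_b


def remix_latents(harm_a, rhy_a, harm_b, rhy_b, harmony_mix, rhythm_mix):
    """Mix harmonic and rhythmic latents with two independent slider values.

    Assumption: each slider is a value in [0, 1] that blends track A's
    latent with track B's latent for one of the two factors.
    """
    z_harmony = lerp(harm_a, harm_b, harmony_mix)
    z_rhythm = lerp(rhy_a, rhy_b, rhythm_mix)
    return z_harmony, z_rhythm


# Hypothetical latents of dimension 4 standing in for encoder outputs.
harm_a, rhy_a = np.zeros(4), np.zeros(4)
harm_b, rhy_b = np.ones(4), np.full(4, 2.0)

# Harmony taken fully from B, rhythm kept fully from A.
z_harmony, z_rhythm = remix_latents(harm_a, rhy_a, harm_b, rhy_b, 1.0, 0.0)
```

Because the two factors are stored in separate latents, moving one slider changes only the corresponding latent. A decoder would then generate the output spectrogram from the pair `(z_harmony, z_rhythm)`.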