andabi / music-source-separation

Deep neural networks for separating singing voice from music written in TensorFlow

Deep Neural Network for Music Source Separation in TensorFlow

This work is from the Jeju Machine Learning Camp 2017.

Intro

Deep neural networks have recently been applied in many fields and have improved results on a wide range of tasks. Applying them to MIR (Music Information Retrieval) tasks has likewise brought large performance gains. Music source separation is the task of isolating individual sources from a mixed recording; here, it means separating the singing voice from the accompaniment in music such as pop songs. In this project, I implement a deep neural network model for music source separation in TensorFlow.

Implementations

Requirements

Usage

[Related Paper] Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks (2014) [3]

Proposed Methods

Overall process

Model

Loss

Experiments

Settings

[Related Paper] Music Signal Processing Using Vector Product Neural Networks (2017) [1]

Approach

Context-windowed Transformation (WVPNN)

Loss

Evaluation Metric

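GNSDR, GSIR, and GSAR are used. These are the global (length-weighted) means over all test clips of the per-clip NSDR (normalized source-to-distortion ratio), SIR (source-to-interference ratio), and SAR (source-to-artifacts ratio), as defined in [3].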
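
As a rough illustration of how the global metrics aggregate per-clip scores, here is a minimal sketch in plain Python. It is not taken from this repository's code. It assumes the per-clip SDR, SIR, and SAR values (in dB) have already been computed by a BSS Eval implementation, and that NSDR is the SDR of the estimated voice minus the SDR of the unprocessed mixture, following [3]. The function names and dictionary keys (`weighted_mean`, `global_metrics`, `sdr_mixture`, and so on) are hypothetical.

```python
def weighted_mean(values, lengths):
    """Mean of `values`, weighting each entry by its clip length."""
    total = sum(lengths)
    if total <= 0:
        raise ValueError("total clip length must be positive")
    return sum(v * n for v, n in zip(values, lengths)) / total


def global_metrics(clips):
    """Aggregate per-clip scores into GNSDR, GSIR, and GSAR.

    Each clip is a dict with hypothetical keys:
      sdr         -- SDR of the estimated voice against the clean voice (dB)
      sdr_mixture -- SDR of the raw mixture against the clean voice (dB)
      sir, sar    -- SIR and SAR of the estimated voice (dB)
      length      -- clip length (samples or seconds, used as the weight)
    """
    lengths = [c["length"] for c in clips]
    # NSDR is the SDR improvement over using the mixture as the estimate.
    nsdr = [c["sdr"] - c["sdr_mixture"] for c in clips]
    return {
        "GNSDR": weighted_mean(nsdr, lengths),
        "GSIR": weighted_mean([c["sir"] for c in clips], lengths),
        "GSAR": weighted_mean([c["sar"] for c in clips], lengths),
    }


# Made-up scores for two clips, for illustration only (not real results).
example_clips = [
    {"sdr": 8.0, "sdr_mixture": 2.0, "sir": 12.0, "sar": 10.0, "length": 10},
    {"sdr": 4.0, "sdr_mixture": 0.0, "sir": 8.0, "sar": 6.0, "length": 30},
]
example_scores = global_metrics(example_clips)
```

Because the weights are clip lengths, the longer second clip dominates the result: the example NSDR values of 6 dB and 4 dB average to 4.5 dB rather than the unweighted 5 dB.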

Results

References

  1. Zhe-Cheng Fan, Tak-Shing T. Chan, Yi-Hsuan Yang, and Jyh-Shing R. Jang, "Music Signal Processing Using Vector Product Neural Networks", Proc. of the First Int. Workshop on Deep Learning and Music joint with IJCNN, May, 2017
  2. P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2136–2147, Dec. 2015
  3. P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks" in International Society for Music Information Retrieval Conference (ISMIR) 2014.
  4. Tohru Nitta, "A backpropagation algorithm for neural networks based on 3D vector product. In Proc. IJCNN", Proc. of IJCAI, 2007.