fgnt / pb_bss

Collection of EM algorithms for blind source separation of audio signals
MIT License
270 stars 60 forks source link
beamforming bss em-algorithm multi-channel source-separation speech-enhancement speech-processing

Blind Source Separation (BSS) algorithms

Build Status Azure DevOps tests Azure DevOps coverage MIT License

This repository covers EM algorithms to separate speech sources in multi-channel recordings.

In particular, the repository contains methods to integrate Deep Clustering (a neural network-based source separation algorithm) with a probabilistic spatial mixture model as proposed in the Interspeech paper "Tight integration of spatial and spectral features for BSS with Deep Clustering embeddings" presented at Interspeech 2017 in Stockholm.

@InProceedings{Drude2017DeepClusteringIntegration,
  Title                    = {Tight integration of spatial and spectral features for {BSS} with Deep Clustering embeddings},
  Author                   = {Drude, Lukas and and Haeb-Umbach, Reinhold},
  Booktitle                = {INTERSPEECH 2017, Stockholm, Sweden},
  Year                     = {2017},
  Month                    = {Aug}
}

Installation

Install it directly from source

git clone https://github.com/fgnt/pb_bss.git
cd pb_bss
pip install --editable .

We expect that numpy, scipy and cython are installed (e.g. conda install numpy scipy cython or pip install numpy scipy cython).

The default option is to install only the necessary dependencies. When you want to run the tests or execute the notebooks, use the one of the following commands for the installation:

pip install --editable .[all]  # Without a whitespace between `.` and `[all]`
pip install git+https://github.com/fgnt/pb_bss.git#egg=pb_bss[all]