bronichern / DeepFry

MIT License

DeepFry: Identifying Vocal Fry Using Deep Neural Networks

This repository is for the paper DeepFry: Identifying Vocal Fry Using Deep Neural Networks by Bronya R. Chernyak, Talia Ben Simon, Yael Segal, Jeremy Steffman, Eleanor Chodroff, Jennifer S. Cole, Joseph Keshet.
It contains code for predicting creaky voice, as well as pre-trained models.

We provide two pre-trained models:

This repository lets you identify creaky frames in a given audio file; see the details below.

Requirements and Installation

Conda (Linux)

conda env create -f environment.yml

Custom Installation

Identifying Creak (Running the repo)

There are two options to run this repository:

  1. Run on a directory with wav files without corresponding annotated textgrids.
  2. Run on a directory with wav files and their corresponding textgrids.
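Before choosing between the two options, it can help to check which wav files in a directory actually have a textgrid next to them. The following is a minimal sketch, not part of the repository: the helper name `split_by_annotation` is hypothetical, and it assumes a textgrid shares the base name of its wav file and uses a `.TextGrid` extension.

```python
from pathlib import Path


def split_by_annotation(data_dir):
    """Return (annotated, unannotated) lists of wav paths found in data_dir.

    Assumption (not confirmed by the repository): a textgrid has the same
    base name as its wav file and a '.TextGrid' extension, which is
    compared case-insensitively here.
    """
    data_dir = Path(data_dir)
    files = list(data_dir.iterdir())

    # Base names of every textgrid in the directory
    textgrid_stems = {p.stem for p in files if p.suffix.lower() == ".textgrid"}

    annotated, unannotated = [], []
    for wav in sorted(p for p in files if p.suffix.lower() == ".wav"):
        if wav.stem in textgrid_stems:
            annotated.append(wav)
        else:
            unannotated.append(wav)
    return annotated, unannotated
```

If the second list is non-empty, the directory is not fully annotated, and the run without annotations is likely the safer choice.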

Identify Creak - ALLSTAR dataset

This option lets you test the repository. The 'allstar' folder contains wav files and their corresponding textgrids, which we used to test our model, as specified in the paper.
Note that the results in the paper were reported for 20 ms to allow a proper comparison between methods, whereas our model was trained on 5 ms, so the measures here might differ slightly.
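To see why a change of resolution can shift the measures, consider one possible way of mapping 5 ms decisions onto 20 ms units. This is only an illustrative sketch: the function `coarsen_frames`, the grouping of four consecutive frames, and the 50% threshold are all assumptions made for the example, not the procedure used in the paper or in this repository.

```python
def coarsen_frames(labels, factor=4, min_fraction=0.5):
    """Merge binary frame labels (1 = creak, 0 = no creak) into coarser frames.

    Assumptions for illustration only: `factor` consecutive fine frames
    (4 x 5 ms = 20 ms) form one coarse frame, and a coarse frame counts as
    creaky when at least `min_fraction` of its fine frames are creaky.
    """
    coarse = []
    for start in range(0, len(labels), factor):
        chunk = labels[start:start + factor]  # the last chunk may be shorter
        creaky_share = sum(chunk) / len(chunk)
        coarse.append(1 if creaky_share >= min_fraction else 0)
    return coarse
```

Under this scheme a single creaky 5 ms frame inside an otherwise modal 20 ms window disappears at the coarser resolution, which is one way frame-level measures can change between the two settings.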

Run option #1: Only output measures:

python run.py --data_dir allstar --model_name model_path --cuda

Run option #2: Output measures & Write predictions to a textgrid:

python run.py --data_dir allstar --model_name model_path --out_dir out_path --cuda

where model_path is the absolute path to the pre-trained model, and out_path is the path to the directory in which the textgrids containing the model's predictions will be saved.

Identify Creak - custom dataset - no annotations

python run.py --data_dir data_path --model_name model_path --out_dir out_path --custom --cuda

where model_path and out_path are the same as above, and data_path is the absolute path to a directory with the wav files in which creak should be identified.
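When the same command has to be run for several directories or models, assembling the argument list in Python avoids quoting mistakes. The sketch below uses only the flags that appear in the commands in this README (`--data_dir`, `--model_name`, `--out_dir`, `--custom`, `--cuda`); the helper `build_command` is hypothetical, and it assumes `--out_dir`, `--custom`, and `--cuda` can each be omitted independently.

```python
import sys


def build_command(data_dir, model_path, out_dir=None, custom=False, cuda=False):
    """Assemble the argument list for run.py from the flags shown in this README.

    Assumption: --out_dir, --custom and --cuda are optional and independent;
    the README only shows them in the combinations listed above.
    """
    cmd = [
        sys.executable, "run.py",
        "--data_dir", str(data_dir),
        "--model_name", str(model_path),
    ]
    if out_dir is not None:
        cmd += ["--out_dir", str(out_dir)]  # write predictions to textgrids
    if custom:
        cmd.append("--custom")              # custom dataset mode
    if cuda:
        cmd.append("--cuda")                # run on the GPU
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, for example `build_command("data_path", "model_path", out_dir="out_path", custom=True, cuda=True)` for the command shown above.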

Identify Creak - custom dataset - with annotations

Here model_path and out_path are the same as above, and data_path is the absolute path to a directory containing the wav files in which creak should be identified, alongside their corresponding textgrids.