Yosi Shrem (joseph.shrem@campus.technion.ac.il), Felix Kreuk, Joseph Keshet (jkeshet@technion.ac.il).
FormantsTracker is a software package for formant tracking and estimation using deep learning.
We propose a new method for measuring formant frequencies that uses probabilistic heat-maps rather than traditional regression. This makes the predictions more flexible and supports both in-distribution and out-of-distribution (OOD) samples with greater precision.
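To illustrate how a heat-map differs from a regression output, here is a minimal sketch of turning one frame's heat-map into a frequency estimate. It is not the authors' implementation: the function name `decode_heatmap`, the bin width `bin_hz`, the convention that bin `i` corresponds to `i * bin_hz`, and the toy distribution are all assumptions made for illustration.

```python
def decode_heatmap(probs, bin_hz):
    """Decode one frame's distribution over frequency bins.

    Returns (peak_hz, expected_hz): the frequency of the most likely bin
    and the probability-weighted mean frequency.
    Assumption: bin i corresponds to the frequency i * bin_hz.
    """
    total = sum(probs)
    norm = [p / total for p in probs]  # make sure the weights sum to 1
    peak_bin = max(range(len(norm)), key=norm.__getitem__)
    expected_bin = sum(i * p for i, p in enumerate(norm))
    return peak_bin * bin_hz, expected_bin * bin_hz


# Hypothetical frame: 5 bins of 100 Hz, probability mass centred on bin 2.
probs = [0.0, 0.25, 0.5, 0.25, 0.0]
peak_hz, mean_hz = decode_heatmap(probs, 100.0)
print(peak_hz, mean_hz)
```

A regression model would output a single number per formant, whereas a distribution like the one above also shows how the probability mass is spread across neighbouring frequencies.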
The paper, Formant Estimation and Tracking using Probabilistic Heat-Maps, was presented at Interspeech 2022. If you find our work useful, please cite:
@article{shrem2022formant,
title={Formant Estimation and Tracking using Probabilistic Heat-Maps},
author={Shrem, Yosi and Kreuk, Felix and Keshet, Joseph},
journal={arXiv preprint arXiv:2206.11632},
year={2022}
}
conda create -n FormantsTracker python=3.9 -y
conda activate FormantsTracker
git clone https://github.com/MLSpeech/FormantsTracker.git
cd FormantsTracker
pip install -r requirements.txt
You can either set the paths for the run (opt1) or use the default values (opt2). The generated predictions are for every 10ms frame.
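Since predictions are produced once per 10 ms frame, a frame index can be converted to a time offset as sketched below. The helper name `frame_start_seconds` is hypothetical, and the sketch assumes that frame 0 starts at time 0 and that each frame is stamped at its start; the package may use a different convention.

```python
FRAME_STEP_MS = 10  # one prediction every 10 ms, as stated above


def frame_start_seconds(frame_index):
    """Time offset in seconds of a frame, assuming frame 0 starts at t = 0."""
    return frame_index * FRAME_STEP_MS / 1000


# The 150th frame (index 150) would start 1.5 seconds into the recording.
print(frame_start_seconds(150))
```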
opt1: Run main.py with test_dir and predictions_dir as arguments.
For example:
python main.py test_dir=<data_dir_path> predictions_dir=<predictions_dir_path>
The default values are set in ./conf/config.yaml.
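opt2: Place your .wav files in the ./test_dir/ directory and run:
python main.py
The predictions are written to the ./predictions directory.
The tool also looks for .wav files inside sub-directories, so there is no need to re-arrange your data.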
For example:
./data
└───dir1
│ │ 1.wav
│ │ 2.wav
│ │ 3.wav
│ │
└───dir2
│ 1.wav
│ 2.wav
│ 3.wav
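To check which files a nested layout like the one above contains before running the tracker, a recursive search can be sketched with the standard library. This is only an illustration of the idea, not the package's own discovery code; the function name `find_wav_files` is hypothetical.

```python
from pathlib import Path


def find_wav_files(root):
    """Return every .wav file under root, at any depth, in sorted order."""
    return sorted(p for p in Path(root).rglob("*.wav") if p.is_file())


if __name__ == "__main__":
    # With the example layout, this lists the three files in dir1 and in dir2.
    for wav in find_wav_files("./data"):
        print(wav)
```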