TargetOptimizer
GNU General Public License v3.0

TargetOptimizer is free and open-source PC software written in C++, originally by Patrick Schmager, that estimates pitch targets according to the Target Approximation Model by Yi Xu.

Hence, it is similar to PENTAtrainer, but differs in the following ways:

The motivation for these differences is described in Birkholz P, Schmager P, Xu Y (2018). Estimation of Pitch Targets from Speech Signals by Joint Regularized Optimization. In: Proc. of the 26th European Signal Processing Conference (EUSIPCO 2018), pp. 2089-2093, Rome, Italy.

TargetOptimizer 2.0.1 is an extension of TargetOptimizer. The performance in terms of minimizing RMSE during estimation of pitch targets has been greatly improved by:

Several features have also been added, such as early stopping during parameter optimization, which improves overall performance, and the option to work on contours other than pitch contours.
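To make the two terms used above concrete, here is a minimal C++ sketch of an RMSE computation between a measured contour and a model contour, together with a generic early-stopping rule. This is an illustration only, not TargetOptimizer's actual code: the function names `rmse` and `earlyStopIteration` and the parameters `patience` and `minDelta` are hypothetical, and the stopping criterion actually used by TargetOptimizer may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Root-mean-square error between two non-empty contours of equal length.
double rmse(const std::vector<double>& observed, const std::vector<double>& model) {
    assert(!observed.empty() && observed.size() == model.size());
    double sumSq = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double diff = observed[i] - model[i];
        sumSq += diff * diff;
    }
    return std::sqrt(sumSq / static_cast<double>(observed.size()));
}

// Hypothetical early-stopping rule (an assumption, not the tool's criterion):
// given the error after each optimization iteration, return the index of the
// iteration at which to stop. Optimization stops once the best error has not
// improved by more than minDelta for `patience` consecutive iterations.
// If that never happens, the index of the last iteration is returned.
std::size_t earlyStopIteration(const std::vector<double>& errors,
                               std::size_t patience, double minDelta) {
    assert(!errors.empty() && patience > 0);
    double best = errors[0];
    std::size_t stale = 0;  // iterations since the last real improvement
    for (std::size_t i = 1; i < errors.size(); ++i) {
        if (errors[i] < best - minDelta) {
            best = errors[i];
            stale = 0;
        } else if (++stale >= patience) {
            return i;
        }
    }
    return errors.size() - 1;
}
```

In a real optimizer the errors would be produced one iteration at a time; the vector form is used here only to keep the sketch short and easy to test.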

Benchmark results and added features are described in Paul Konstantin Krug, Simon Stone, Alexander Wilbrandt, and Peter Birkholz (2021). TargetOptimizer 2.0: Enhanced Estimation of Articulatory Targets.

The software can be executed as a command line tool (without a GUI, to support batch processing) or as an application with a GUI.

Build for Windows using Visual Studio 2019+

Open the solution TargetOptimizer.sln and build the configuration that matches your use case:

  1. Command line tool ("Release")
  2. GUI ("wxWidgets_Release"), when using wxWidgets
  3. GUI ("wxWidgets_VCPKG_Release"), when using wxWidgets via VCPKG

Build for Linux using GCC:

Navigate inside the Sources folder and run one of the following commands.

For the GUI version (requires wxWidgets):

g++ -std=c++17 -O3 -D USE_WXWIDGETS -I.. ../dlib/all/source.cpp -fopenmp -fpermissive -lpthread -lX11 *.cpp `wx-config --cxxflags --libs std` -o TargetOptimizer -w -lstdc++fs

For the command-line-only version:

g++ -std=c++14 -O3 -I.. ../dlib/all/source.cpp -fopenmp -lpthread -lX11 *.cpp -o TargetOptimizer

Run TargetOptimizer from the command line:

Run TargetOptimizer -h for instructions.

Using TargetOptimizer 2.0.1

Only a short introduction can be given here. For a more detailed description of how to use TargetOptimizer 2.0.1, please refer to the manual.

A screenshot of the GUI is shown below (for the German word "Betriebssportgemeinschaft").

Screenshot Target Optimizer 2.0.1

The following steps are necessary to extract targets for an utterance (steps 1 and 2 can be done simultaneously if a TextGrid file is used):

  1. Input the boundaries (either load them from a Praat TextGrid file by pressing "Open File(s)", or initialize them manually by selecting the desired number of boundaries and pressing "Init bounds")
  2. Input the contour to work on (load it from a Praat PitchTier file by pressing "Open File(s)")
  3. Carefully think about the parameters you want to change (the default values are generally well suited)
  4. Press "Optimize Targets"
  5. Export the results as a CSV file, a gestural score for VocalTractLab, or as a Praat PitchTier file with the model f0 contour by pressing "Save as..."
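
Steps 2 and 5 above exchange contours as Praat PitchTier files. For readers who want to inspect or pre-process such files outside the GUI, the following is a minimal C++ sketch, not TargetOptimizer's own reader, that extracts the (time, value) points from a PitchTier stored in Praat's long text format. The names `PitchPoint`, `parseField`, and `readPitchTierPoints` are hypothetical. The sketch assumes a plain 8-bit or UTF-8 encoded file; Praat's short text and binary formats are not handled.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One point of a PitchTier: a time stamp and the contour value at that time.
struct PitchPoint {
    double time;   // "number" field, in seconds
    double value;  // "value" field, in Hz for a pitch contour
};

// If the line (ignoring leading whitespace) starts with `key` and contains
// "= <number>", store the number in `out` and return true.
static bool parseField(const std::string& line, const std::string& key, double& out) {
    const std::size_t start = line.find_first_not_of(" \t");
    if (start == std::string::npos) return false;
    if (line.compare(start, key.size(), key) != 0) return false;
    const std::size_t eq = line.find('=', start + key.size());
    if (eq == std::string::npos) return false;
    std::istringstream field(line.substr(eq + 1));
    return static_cast<bool>(field >> out);
}

// Reads all points from a PitchTier in long text format, where each point is
// written as a "number = ..." line followed by a "value = ..." line.
// Header lines such as xmin, xmax, and the point count are skipped.
std::vector<PitchPoint> readPitchTierPoints(std::istream& in) {
    std::vector<PitchPoint> points;
    std::string line;
    double time = 0.0;
    double value = 0.0;
    bool haveTime = false;  // true after "number", until the matching "value"
    while (std::getline(in, line)) {
        if (parseField(line, "number", time)) {
            haveTime = true;
        } else if (haveTime && parseField(line, "value", value)) {
            points.push_back({time, value});
            haveTime = false;
        }
    }
    return points;
}
```

To read from disk, pass a `std::ifstream` opened on the file. Taking a `std::istream&` rather than a file name keeps the function easy to test with in-memory strings.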

How to cite:

If you use TargetOptimizer in a publication, we would appreciate it if you cited the following paper:

Krug, P., Stone, S., Wilbrandt, A. and Birkholz, P., 2021. TargetOptimizer 2.0: Enhanced Estimation of Articulatory Targets. In: Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2021. TUDPress, Dresden, Germany.