janniss91 / SpeedyPanther

A repo that contains different speech processing projects and applications.

Perform phone classification #4

Open janniss91 opened 1 year ago

janniss91 commented 1 year ago

Use the sample from the TIMIT dataset to perform phone classification.

  1. Read the TIMIT data and make it possible to query for all occurrences of the same phones
  2. Find a simple way of classifying phones (kNN?)
    • To do this, each phone is split into short analysis windows, and each window is transformed to the frequency domain.
    • Once this is done, kNN could be applied per window, and a majority vote among all windows of a phone could determine the phone type.
  3. Move the code for phone prediction from the notebook to Python files and refactor it.
    • create PhonePrediction class (factory?) to allow for different classification techniques (not only kNN)
    • put hyperparameter tuning in separate file
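
A minimal sketch of step 2, to make the idea concrete. It is not the code from the notebook. The function names, the 400-sample window with a 160-sample hop (25 ms / 10 ms at 16 kHz), the Hann window, the log-magnitude features and k=5 are all assumptions chosen for illustration, and the kNN is a plain NumPy brute-force search rather than a library classifier.

```python
from collections import Counter

import numpy as np


def frame_spectra(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping windows and return log-magnitude spectra.

    frame_len and hop are illustrative defaults (25 ms / 10 ms at 16 kHz).
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        # Zero-pad very short phones so that at least one window exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))


def knn_predict_frames(train_x, train_y, query_x, k=5):
    """Label each query frame by a vote among its k nearest training frames."""
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return [Counter(train_y[i] for i in row).most_common(1)[0][0] for row in nearest]


def predict_phone(train_x, train_y, segment, k=5):
    """Classify a whole phone segment by majority vote over its frame labels."""
    frame_labels = knn_predict_frames(train_x, train_y, frame_spectra(segment), k)
    return Counter(frame_labels).most_common(1)[0][0]
```

Here `train_x` would be the stacked spectra of all training windows and `train_y` a list with one phone label per window. Ties in either vote are resolved by whichever label was counted first, which a real implementation might want to handle more deliberately.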

potential problems:

janniss91 commented 1 year ago

Phone prediction per frame with kNN led to ~30% accuracy. For full phones (all frames of a phone combined by majority vote), kNN predictions led to ~73% accuracy.

These values were attained despite the small number of speakers and the different dialects in the sample.
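
The gap between the two numbers is expected: the vote only needs the correct label to be the most frequent one among a phone's frames, not the label of most frames. A small sketch of how both accuracies can be computed from the same per-frame predictions follows. The function name, the record layout and the toy predictions are made up for illustration and are not the TIMIT results.

```python
from collections import Counter, defaultdict


def frame_and_phone_accuracy(records):
    """Return (frame accuracy, phone accuracy).

    records: iterable of (segment_id, true_phone, predicted_frame_label),
    one entry per analysis window.
    """
    frames_total = 0
    frames_correct = 0
    votes = defaultdict(Counter)  # segment_id -> counts of predicted labels
    truth = {}                    # segment_id -> true phone
    for seg_id, true_phone, predicted in records:
        frames_total += 1
        frames_correct += predicted == true_phone
        votes[seg_id][predicted] += 1
        truth[seg_id] = true_phone
    phones_correct = sum(
        votes[seg_id].most_common(1)[0][0] == truth[seg_id] for seg_id in votes
    )
    return frames_correct / frames_total, phones_correct / len(votes)


# Toy per-frame predictions for three phone segments (invented values).
records = [
    (0, "s", "s"), (0, "s", "z"), (0, "s", "sh"), (0, "s", "s"), (0, "s", "f"),
    (1, "aa", "aa"), (1, "aa", "ao"), (1, "aa", "aa"),
    (2, "t", "d"), (2, "t", "d"), (2, "t", "t"),
]
frame_acc, phone_acc = frame_and_phone_accuracy(records)
print(f"frame accuracy: {frame_acc:.2f}, phone accuracy: {phone_acc:.2f}")
```

In the toy data, segment 0 has only 2 of 5 frames right, yet "s" still wins the vote because the errors are spread over several other labels.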